To obtain the most efficient and
successful research results, scientists 
need robust and reliable equipment. 


Growing up in rural Greece, Dr. Efthymia Iliana
Matthaiou was captivated by nature. As she says,
“I especially like finding ways to answer
questions.” When her grandfather
was diagnosed with cancer and
there were limited treatment options,
Matthaiou decided to pursue a
career in translational research.

Dr. Efthymia Iliana Matthaiou

In 2020, after completing her postdoctoral training, Matthaiou started an 
academic scientist position at Stanford University, where she investigates host- 
pathogen interactions in the airways. She is particularly interested in patients with 
cystic fibrosis. “Unfortunately, this patient population is prone to infections,” she 
explains. 

Like other life science labs, Matthaiou’s lab requires a variety of equipment, 
ranging from an incubator and biosafety cabinet to an ultralow temperature 
freezer and liquid-nitrogen storage. At the start, though, she needed to set up her 
laboratory. “We were building up a new lab,” she says. “We had zero equipment.” 


Equipping a lab 

An indispensable facet of Matthaiou’s research is the collection and safe storage of 
samples. “We are collecting patient samples on a daily basis,” she says. "These are 
precious human samples that cannot be replaced.” 

To maintain these samples in stable conditions, Matthaiou chose a Thermo 
Scientific -80°C TSX Upright Freezer. “We don't want to have temperature 
fluctuations, we need reliable equipment,” she explains. 

For samples that need to be stored at lower temperatures, Matthaiou opted for 
a Thermo Scientific CryoPlus Vapor Phase LN2 (Liquid Nitrogen) Storage System.
“What I love about this is the autofill liquid nitrogen feature,” she says. “The vapor
phase platform minimizes the chances of sample contamination. Plus, it offers 
easier access to the samples and may defend researchers from the risks of 
operating with liquid nitrogen.” 
Decontamination is essential when culturing infectious agents. For that purpose,
Matthaiou selected a Thermo Scientific Forma Steri-Cycle CO2 Incubator. A key
feature that she notes is high-temperature decontamination. As she points out: “The 
high-temperature sterilization cycle is essential for decontamination and simplifies 
routine cleaning.” 

When working with human samples and infectious agents, the safety of the 
laboratory personnel is a priority, Matthaiou explains. “We're working with airborne 
pathogens,” she says. "We need to work in a safe environment, and BSL-2 [biosafety 
level 2] biosafety cabinets are essential for our safety.” With Matthaiou’s choice of 
Thermo Scientific’s 1300 Series Class II, Type A2 Biological Safety Cabinets, she and
her colleagues got safety, energy efficiency, and adjustable ergonomics. “We are
working long hours under biosafety cabinets, and I want to make sure that my team
is working in a safe and comfortable environment,” she says. “The Thermo Scientific 
1300 series has armrests, and the height is easy to adjust.” 

As Denzil Vaughn—vice president and general manager, Growth, Protection, 
and Separation at Thermo Fisher Scientific—says, "We are proud that Dr. Matthaiou 
selected our products to help further her research into resolving the threat of 
compromised immunity and pulmonary function.” 


Clearing pathogens 
Having established a well-equipped lab, Matthaiou hopes to discover more about 
host-pathogen interactions in patients with cystic fibrosis who received a lung 
transplant. In this patient population, she wants to identify the factors that cause 
increased risk for infections. “How does this genetic defect and the transplant 
microenvironment impact the innate immune system?" she asks. “The reasons 
why these patients cannot clear the pathogens are not fully understood, and these 
infections drive the rejection of the transplant.” 

For now, Matthaiou works to understand the basic biology in these patients. 
“Later on," she says, "I hope to develop therapies for these patients.” 
EDITORIAL


A key time for UK-Europe science 


The opening line of a recent Financial Times article
put it best: “Relations between the UK and
EU badly need a reset.” Although the article was 
mostly about geopolitics, the disconnect also 
applies to science and the current uncertainty 
about whether the UK will remain an associated 
partner in European Union (EU) research pro- 
grams such as Horizon Europe. In the post-Brexit era, 
and with a new UK Prime Minister to be named shortly, 
the UK and EU should be considering how best to maxi- 
mize the potential of the numerous brilliant scientists, 
technicians, academics, and clinicians working in the 
universities and research institutes of all European 
countries, including the UK. 

An unintended casualty of the UK’s withdrawal from 
the EU was the country’s ability to participate fully in 
the collaborative ecosystem of research and innovation 
that had evolved during the 
UK’s 47-year membership in 
the organization. British uni- 
versities, and the staff and stu- 
dents working and studying 
in them, have been highly de- 
sired partners. These interac- 
tions, which could continue if 
the UK is an associated coun- 
try, are currently at great risk 
because of the complexities of 
the post-Brexit negotiations. 
Hopefully, after last month’s 
tumult surrounding the res- 
ignation of Boris Johnson 
and the uncertainty about who will be the next Prime 
Minister, there will be a calm reappraisal of the risks of 
going it alone for both the UK and the European con- 
tinent. What could emerge is a renewed effort to find a 
negotiated solution that allows the UK to continue to 
maximize its potential as a fully engaged contributor to 
European research and innovation. 

To its credit, the current UK government has worked 
to promote and enhance the country’s own scientific 
endeavors by increasing current funding and pledging 
more in the future, and by enhancing visa schemes and 
other immigration rules to continue to attract and re- 
tain talented individuals from home and abroad. 

There has also been work on a so-called “Plan B” for the 
eventuality that the UK fails to associate with Horizon Eu- 
rope, the EU’s key research funding program with a bud- 
get of 95.5 billion euros. However, this would be a poor 
second best: Witness Switzerland’s similar approach a 
few years ago, which left its researchers seriously isolated. 




At the same time, the UK government recognizes 
that some collaborations can pose national security 
risks, particularly with institutions in countries whose 
governments the UK disagrees with. This has led to a 
clamor among a subset of politicians for more legisla- 
tion to “control” or “manage” the country’s universities. 
I have been involved in some of these discussions and 
have been heartened by many experts in UK security 
agencies and in parts of the UK government who recog- 
nize the value of these partnerships and of keeping risk 
mitigation measures proportionate and balanced. The 
UK could learn from experiences in Australia and the 
United States where similar recent legislation related 
to national security has stymied research and innova- 
tion. For example, Australian universities expressed 
concern about their autonomy and about limitation of 
their abilities to deliver societal benefits. 

The university sector in the 
UK has warmly welcomed the 
creation of the Research Col- 
laboration Advice Team, which 
will provide a single point of 
contact with UK government 
and security agencies. I and 
others will continue to work 
with them and with the se- 
curity services to understand 
risk, disseminate good prac- 
tices, and provide an early 
warning system when real 
dangers are recognized. 

A mature, two-way relation- 
ship between government and the university sector is in 
everyone's best interests. A similar maturity should be 
brought to finding solutions for the challenges posed by 
securing the UK’s association with Horizon Europe and 
with other EU programs. 

Without a “reset” of UK-EU scientific relations, the 
“brain drain” from the UK—which has already started,
with at least 19 researchers funded by the European 
Research Council recently relocating to EU countries 
to keep their funding—will become an avalanche. 
The role of the UK in the cohesion and productivity 
of European science will be the victim, with serious 
implications for global science capability. There is an 
opportunity for the UK government and the European 
Commission to prevent this now. It’s time to untangle 
science from post-Brexit geopolitics so that European 
science can thrive. 


—Peter Mathieson 
“It is embarrassing for the F.D.A. ... to have its employees
go to a company that is a leading manufacturer of death.”


Micah Berman, a public health law expert at Ohio State University, commenting in The New York
Times after the chief of the office of science in the Food and Drug Administration's
Center for Tobacco Products moved to a job with tobacco giant Philip Morris International.


Edited by Shraddha Chakradhar 


AlphaFold 
predicted the 
structure of a 
protein from 
the thale cress 
plant that may 
confer disease 
resistance. 


ARTIFICIAL INTELLIGENCE 


AI reveals structures of nearly all proteins


In a potentially transformative event for drug development and biology
studies, the artificial intelligence (AI) company DeepMind last
week unveiled the likely structures of nearly all known proteins, more 
than 200 million in total, from organisms ranging from bacteria to 
humans. The structural bounty comes from AlphaFold, one of the 
new AI programs that have cracked the protein-folding problem and 
learned how to accurately derive 3D shapes of proteins from their amino 
acid sequences. DeepMind released some 350,000 predicted structures 
last year and AlphaFold has since continued to churn out new structures, 
taking about 10 to 20 seconds per protein, according to the company. The 
latest structures were released into an existing database through a part- 
nership with the European Molecular Biology Laboratory’s European 
Bioinformatics Institute. “With this new addition of structures illumi- 
nating nearly the entire protein universe, we can expect more biological 
mysteries to be solved each day,” Eric Topol, director of the Scripps
Research Translational Institute, tweeted about the achievement. 
Boost for heart gene cures


HEART DISEASE | The British Heart 
Foundation will award £30 million
($36 million) over 5 years to an inter-
national team to develop genetic cures 
for some inherited heart diseases called 
cardiomyopathies. The group, dubbed 
CureHeart, won over three others 
shortlisted by the Big Beat Challenge, a 
competition launched in 2019 to fund 
transformative heart disease research. The 
team aims to use one-time injections of 
gene-editing tools to precisely correct or 
silence mutations that cause heart muscle 
cells to produce too little or a harmful 
form of a needed protein. These cardio- 
myopathies affect one in every 250 people, 
putting them at risk for heart attacks
and heart failure; some will need a heart
transplant. Within 5 years, CureHeart 
members in the United States, United 
Kingdom, and Singapore hope to develop 
one or more treatments to the point
that companies will pick them up for
clinical testing. 


Racism takes a toll on memory 


HEALTH INEQUITIES | Two studies 
presented this week at the Alzheimer’s 
Association International Conference in 
San Diego find people who experienced 
racism and discrimination have lower 
memory scores, a symptom of dementia.
Systemic inequities such as limited
health care access are likely responsible. 
In an analysis of nearly 1000 middle- 
aged Black, Latino, and white adults in 
the United States, researchers found past 
experiences of individual and structural 
racism, such as residential segregation, 
were correlated with poorer episodic 
memory—the ability to recall personal 
events in one’s life. Black individuals, 
who are more likely to have Alzheimer’s 
and other dementias, were the most 
affected. In the other study, of almost 450 
Asian, Black, Latino, white, and multi- 
racial people ages 90 or older, those who 
experienced discrimination throughout 
their lives had lower semantic memory— 
general world knowledge accumulated 
over time—compared with those who 
experienced little to no discrimination. 
A World AIDS Day event in Indonesia
in December 2021 was held
to raise awareness of the disease.


China replaces its CDC chief 


INFECTIOUS DISEASE | Virologist George 
Gao, who helped forge China’s response to 
the COVID-19 pandemic, was replaced as 
director of the Chinese Center for Disease 
Control and Prevention (CCDC) on 26 July. 
Internationally connected and respected, 
Gao, who led CCDC since 2017, was more 
outspoken than most Chinese scientists 
and promoted international cooperation. 
CCDC has said Gao, 60, was replaced 
because of his age, but observers suspect 
the change is part of a wider bureaucratic 
shake-up to strengthen political control 
over the agency. In an email to Science, 
Gao wrote it is “hard to say” what comes 
next in his career but hinted at a continu- 
ing interest in public health, noting that 
emerging diseases and climate change are 
“two of the most important issues mankind 
faces.” Gao’s successor is Shen Hongbing, a 
public health researcher and former presi- 
dent of Nanjing Medical University. 


Mars mission gets revamped 


SPACE | NASA and the European Space
Agency announced last week they will make 
greater use of the Mars-based Perseverance 
rover—with helicopters as a backup—to 
retrieve samples from the planet, in a mis- 
sion launching later this decade. Since 2021, 
Perseverance has been collecting samples of 
Mars rocks and sealing them in finger-size 
tubes. The agencies originally planned to
send a new rover to gather the tubes for 
return, but are now betting Perseverance 
will be fit enough to deliver them in person. 
When a lander arrives around 2030, it will 
use an arm to load 31 tubes into a rocket 
that will launch and rendezvous with
an orbiter waiting to return home to Earth.
If Perseverance shows signs of faltering 
before then, operators can order it to drop 
its cargo. Then two small choppers brought 
by the lander, resembling Perseverance’s 
hovering sidekick Ingenuity, will collect the 
dropped tubes. Either way, the most expen- 
sive half-kilogram of soil in history—given 
the $7 billion mission cost—will land in 
Utah in 2033. 


Nepal counts record tiger numbers 


CONSERVATION | Wild tiger numbers in 
Nepal have nearly tripled since 2009, the 
country announced last week. In 2010, 
with populations declining because of 
poaching and habitat loss, countries with 
tigers pledged to increase the number of 
wild animals from 3200 to more than 
7000 by this year. In December 2021, a 
conservation group estimated fewer than 
5600 across 13 Asian countries. Yet with
a wildlife survey counting 355 of the big
cats within its borders, Nepal has greatly 
exceeded its 2022 goal of 250 such animals. 
Tiger populations are also increasing in 
China and Thailand. Some of the apparent 
global increase may be due to improved
survey methods and tools such as remote
cameras. But in Nepal, the boost is largely
thanks to conservationists’ focus on habi-
tat restoration, such as replanting forests
as corridors for tigers. Nepal has also
reduced tiger deaths by helping villagers
better defend livestock from the cats with
fences and compensating farmers when
tigers kill their animals.


Progress toward ending
AIDS has stalled


The world’s response to the HIV/
AIDS pandemic is faltering badly in
the face of declines in spending and
the COVID-19 pandemic, accord-
ing to an annual update from the
Joint United Nations Programme on HIV/
AIDS (UNAIDS) released last week. In
a campaign announced in 2015 to “end
AIDS as a public health threat” by 2030,
UNAIDS set targets for 2025 that the new
report finds are far from being met. Last
year, 1.5 million people became infected
with HIV, 1 million more than the 2025
target. Of the 38.4 million people living
with the virus in 2021, 10 million are still
not receiving lifesaving antiretroviral
drugs, and last year saw the lowest num-
ber of new people starting treatment in a
decade. Alarmingly, UNAIDS notes, 52%
of infected children aren't being treated.


Senate weighs in 
on research spending 


The U.S. Senate last week released proposed 2023
budgets for key research agencies that are
close to levels agreed on in June by the House
of Representatives (see below, $ billions). But final
numbers likely won't be adopted until after the
November elections.

                                     2022   BIDEN 2023   HOUSE   SENATE
                                            REQUEST      BILLS   BILLS
National Institutes of Health core   45     50.5         47.5    47
ARPA-Health                          1      5            2.7     1
National Science Foundation          8.8    10.5         9.6     10.3
NASA science                         7.6    8            7.9     8
Department of Energy science         7.5    7.8          8       8
NIST core labs                       0.9    1            1       1
EPA science                          0.7    0.9          0.9     0.9
Innovation bill will reshape science agencies


CHIPS and Science Act creates NSF tech directorate and boosts applied research 


By Jeffrey Mervis 


A massive bill that Congress completed
last week aims high: It envisions
a 5-year, $280 billion investment
to keep the United States ahead of
China in a global competition for
technological preeminence. But researchers
won’t see a surge of new money
anytime soon. Instead, the CHIPS and 
Science Act, which President Joe Biden 
is expected to sign into law, will result in 
some of the biggest changes in U.S. innova- 
tion policy in more than a decade, with a 
greater emphasis on applied research and 
geographic diversity in funding, and closer 
scrutiny of foreign collaborations. 

The 1054-page bill calls for more than 
doubling the annual budget of the Na- 
tional Science Foundation (NSF)—now 
$8.8 billion—over 5 years. It would also grow 
the $7.5 billion Office of Science at the De- 
partment of Energy (DOE) by 45% and boost 
the $850 million research account at the 
National Institute of Standards and Technol- 
ogy by 50%. But that money is “authorized,” 
not committed, and congressional spending 
panels must decide each year whether to ap- 
propriate the additional dollars. 
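The authorized growth quoted above can be turned into dollar figures with simple arithmetic. The sketch below uses only the current budgets and percentages stated in the text; the variable names are illustrative, and the results are ceilings that Congress may or may not fund, not committed amounts.

```python
# Current budgets quoted in the article, in billions of dollars
nsf_now = 8.8     # National Science Foundation
doe_now = 7.5     # DOE Office of Science
nist_now = 0.85   # NIST research account ($850 million)

# Authorized growth over 5 years, as described in the bill
nsf_doubled = 2 * nsf_now          # "more than doubling" means above this floor
doe_target = doe_now * 1.45        # 45% increase
nist_target = nist_now * 1.50      # 50% increase

print(f"NSF would exceed ${nsf_doubled:.1f} billion")
print(f"DOE Office of Science would reach about ${doe_target:.2f} billion")
print(f"NIST research would reach about ${nist_target:.3f} billion")
```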

At the same time, the bill makes sig- 
nificant changes in how those agencies 
operate—directives that will go into effect 
immediately. For example, it gives NSF the 
legislative authority to create a technology 
directorate that would nurture innovations 
with commercial potential and social im-
pact, adding to the agency’s traditional mis- 
sion of supporting basic research. The new 
directorate will focus on both emerging 
technologies, such as artificial intelligence 
and quantum information science, and so- 
cietal challenges such as combating climate 
change and training a tech-savvy workforce. 

Poorer, more rural states will benefit from 
a new requirement that both NSF and DOE 
increase the share of research spending go- 
ing to institutions there to 20%. (NSF is now 
at roughly 13%; DOE doesn’t track the figure.)
The bill also directs the Department of Com- 
merce to create regional technology centers 
designed to accelerate economic growth. 

Other sections of the bill address grow- 
ing concerns that China has been stealing 
or unfairly benefiting from research done in 
the United States. In general, these research 
security provisions tighten oversight of in- 
teractions between U.S. and foreign scien- 
tists, and with foreign governments. 

For example, the legislation prohibits U.S.- 
based scientists with federal funding from 
participating in a foreign talent recruitment 
program sponsored by China and Russia and 
bans federal employees from participating 
in such programs from any country. In ad- 
dition, institutions receiving federal research 
dollars must now provide research security 
training to faculty and staff. And NSF has 
been ordered to set up an independent fo- 
rum to discuss how to strengthen research 
security in academic settings. 


Although U.S. higher education orga- 
nizations hailed passage of the CHIPS 
and Science Act, which appropriates 
$52 billion over 5 years to help the semi- 
conductor industry, they are disappointed 
legislators rejected their pleas for at least 
$10 billion in funding to jump-start the 
bill’s grand vision. (The only immediate 
increase for research agencies is a 5-year, 
$200 million appropriation for NSF to 
boost training in microelectronics.) Advo- 
cates fear the authorized funding could be 
an empty promise. 

“We have been here before,” says Peter 
McPherson, president of the 248-member 
Association of Public and Land-grant Uni- 
versities, referring to the 2007 America 
COMPETES Act, which authorized billions 
of additional funding for NSF and DOE that 
never materialized. “The CHIPS Act must 
be step one in a process that ultimately in- 
cludes Congress delivering the funding that 
will accomplish the goals of the legislation.” 

Higher education groups and others were 
relieved that legislators dropped several 
contentious research security provisions, 
including a proposed White House research 
security office that critics saw as a threat to 
legitimate research collaborations, a dupli- 
cation of efforts already underway across 
the government, and an invitation to target 
Asian American scientists. But scientists 
are unhappy that immigration provisions 
designed to retain high-tech talent were re- 
moved. One would have made it easier for 
foreign-born scientists to stay after earning
an advanced degree from a U.S. institution.
Another would have created a new visa cat-
egory for foreign scientists setting up com-
panies based on their research.

Senate Majority Leader Chuck Schumer (D-NY, center)
celebrates Senate passage of the CHIPS and Science
Act on 27 July with other Senate backers.

The bill went through numerous 
iterations—and names—since Senate Ma- 
jority Leader Chuck Schumer (D-NY) pro- 
posed a $100 billion technology directorate 
at NSF in a November 2019 speech. In May 
2020, he joined with Senator Todd Young 
(R-IN) to introduce the Endless Frontiers 
Act, named in homage to Science, the End- 
less Frontier: A Report to the President on 
a Program for Postwar Scientific Research, 
the 1945 report that led to NSF’s creation 
5 years later. 

In June 2021, a greatly expanded version, 
rebranded the U.S. Innovation and Compe- 
tition Act to reflect its increased empha- 
sis on beating China, demanded by many 
Republicans, passed the Senate 68 to 32. 
In February, the House of Representatives 
revived the America COMPETES moniker 
for legislation adopted on a straight party 
line vote. 

It had become the CHIPS and Science 
Act by the Senate’s final 64-to-33 vote on 
27 July, a tally that included 17 Republicans. 
“This is a Sputnik moment, only instead 
of Russia it’s China, in which America re- 
alized that another rival power would get 
way ahead of us if we didn’t pull out all the 
stops,” Schumer said. 

Some scientists had questioned Schumer’s 
original vision to beef up NSF by creating 
a technology directorate focused on applied 
research that would be much larger than 
NSF’s core programs. They also worried 
about its proposed quasi-independent sta- 
tus within the agency. 

The final legislation dramatically shrinks 
the new directorate and simply makes it the 
seventh research directorate at NSF. Still, 
by 2027 legislators hope it will receive more 
than one-fifth of NSF’s aspirational budget 
of $18.9 billion. Some academic leaders re- 
main worried that NSF may favor the new 
directorate over existing programs if NSF’s 
overall budget doesn’t grow. 

On 28 July, House Democrats withstood 
a last-ditch effort by Republican leaders to 
kill the bill, as 24 Republicans contributed 
to the final winning margin of 243 to 187. 
“With this legislation, we are ushering in a 
bold and prosperous future for American 
science and innovation,” said Representa- 
tive Eddie Bernice Johnson (D-TX), chair of 
the science committee and a key player in 
formulating the bill. 
Ordinary computer matches
Google’s quantum computer 


High-profile claim of “quantum supremacy” fades 


By Adrian Cho 


If the quantum computing era dawned
3 years ago, its rising sun may have
ducked behind a cloud. In 2019, Google
researchers claimed they had passed a
milestone known as quantum supremacy
when their quantum computer Sycamore
performed in 200 seconds an abstruse calcu- 
lation they said would tie up a supercomputer 
for 10,000 years. Now, scientists in China 
have done the computation in a few hours 
with ordinary processors. A supercomputer, 
they say, could beat Sycamore outright. 

“I think they’re right that if they had access
to a big enough supercomputer, they could
have simulated the ... task in a matter of seconds,”
says Scott Aaronson, a computer scientist
at the University of Texas, Austin. The
advance takes a bit of the shine off Google's 
claim, says Greg Kuperberg, a mathematician 
at the University of California, Davis. “Get- 
ting to 300 feet from the summit is less excit- 
ing than getting to the summit.” 

Still, the promise of quantum computing 
remains undimmed, Kuperberg and others 
say. And Sergio Boixo, principal scientist for 
Google Quantum AI, said in an email the 
Google team knew its edge might not hold 
for very long. “In our 2019 paper, we said that 
classical algorithms would improve,” he said. 
But, “we don’t think this classical approach
can keep up with quantum circuits in 2022
and beyond.”

The “problem” Sycamore solved was de- 
signed to be hard for a conventional com- 
puter but as easy as possible for a quantum 
computer, which manipulates qubits that 
can be set to 0, 1, or—thanks to quantum 
mechanics—any combination of 0 and
1 at the same time. Together, Sycamore’s 
53 qubits, tiny resonating electrical cir- 
cuits made of superconducting metal, can 
encode any number from 0 to 2^53 (roughly
9 quadrillion)—or even all of them at once. 

Starting with all the qubits set to 0, Google 
researchers applied to single qubits and pairs 
a random but fixed set of logical operations, 
or gates, over 20 cycles, then read out the qu- 
bits. Crudely speaking, quantum waves rep- 
resenting all possible outputs sloshed among 
the qubits, and the gates created interference 
that reinforced some outputs and canceled 
others. So some should have appeared with 
greater probability than others. Over millions 
of trials, a spiky output pattern emerged. 
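The procedure described above can be imitated at toy scale with a state-vector simulation. The sketch below is an illustration under stated assumptions, not Google's experiment: it uses 6 qubits and 8 cycles instead of 53 qubits and 20 cycles, and it draws generic random unitaries rather than Sycamore's actual gate set. The function names `random_unitary`, `apply_gate`, and `run_random_circuit` are hypothetical.

```python
import numpy as np

def random_unitary(dim, rng):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))  # rescale column phases; result stays unitary

def apply_gate(state, gate, qubits, n):
    """Apply a gate acting on the listed qubits to an n-qubit state vector."""
    k = len(qubits)
    psi = state.reshape([2] * n)
    gate_t = gate.reshape([2] * (2 * k))  # axes: k outputs, then k inputs
    # Contract the gate's input axes with the chosen qubit axes of the state
    psi = np.tensordot(gate_t, psi, axes=(list(range(k, 2 * k)), qubits))
    # tensordot places the gate's output axes first; move them back in place
    psi = np.moveaxis(psi, list(range(k)), qubits)
    return psi.reshape(-1)

def run_random_circuit(n_qubits=6, n_cycles=8, seed=0):
    """Return output probabilities of a random but fixed circuit."""
    rng = np.random.default_rng(seed)  # fixed seed: "random but fixed" gates
    state = np.zeros(2 ** n_qubits, dtype=complex)
    state[0] = 1.0  # all qubits start at 0
    for cycle in range(n_cycles):
        for q in range(n_qubits):  # single-qubit gates
            state = apply_gate(state, random_unitary(2, rng), [q], n_qubits)
        for q in range(cycle % 2, n_qubits - 1, 2):  # gates on neighboring pairs
            state = apply_gate(state, random_unitary(4, rng), [q, q + 1], n_qubits)
    return np.abs(state) ** 2

probs = run_random_circuit()
probs = probs / probs.sum()  # guard against rounding drift

# "Read out the qubits" many times by sampling bit strings
samples = np.random.default_rng(1).choice(len(probs), size=100_000, p=probs)
counts = np.bincount(samples, minlength=len(probs))

# Spikiness measure: equals 1 for a flat distribution, larger when spiky
spikiness = len(probs) * np.sum(probs ** 2)
print("most and least frequent outputs:", counts.max(), counts.min())
print("spikiness:", round(float(spikiness), 3))
print("bit strings for 53 qubits:", 2 ** 53)
```

The state vector here has only 64 entries. At 53 qubits it would have 2^53 entries, which is why a direct simulation of this kind does not scale.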

The Google researchers argued that simu- 
lating those interference effects would over- 
whelm even Summit, a supercomputer at Oak 
Ridge National Laboratory, which has 9216 
central processing units and 27,648 faster 
graphic processing units (GPUs). Researchers 
with IBM, which developed Summit, quickly 
countered that if they exploited every bit of 


Even though conventional processors have bested Google’s Sycamore chip, they won't hold their lead for long. 
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hard drive available to the computer, it could 
handle the computation in a few days. Now, 
Pan Zhang, a statistical physicist at the In- 
stitute of Theoretical Physics at the Chinese 
Academy of Sciences, and colleagues have 
shown how to beat Sycamore in a paper in 
press at Physical Review Letters. 

Following others, Zhang and colleagues 
recast the problem as a 3D mathematical ar- 
ray called a tensor network. It consisted of 
20 layers, one for each cycle of gates, with 
each layer comprising 53 dots, one for each 
qubit. Lines connected the dots to repre- 
sent the gates, with each gate encoded in a 
tensor—a 2D or 4D grid of complex numbers. 
Running the simulation then reduced to, es- 
sentially, multiplying all the tensors. “The ad- 
vantage of the tensor network method is we 
can use many GPUs to do the computations 
in parallel,” Zhang says.
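To make the idea of "multiplying all the tensors" concrete, here is a minimal sketch on a two-qubit circuit: a Hadamard gate (a 2D tensor) followed by a CNOT gate (a 4D tensor). The circuit and the `amplitude` helper are assumptions chosen for illustration; the team's network had 53 qubits and 20 layers and relied on techniques not shown here.

```python
import numpy as np

zero = np.array([1.0, 0.0])                      # a qubit set to 0
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)  # 2D tensor: H[out, in]
# CNOT as a 4D tensor with axes [out0, out1, in0, in1]; qubit 0 is the control
CNOT = np.eye(4)[[0, 1, 3, 2]].reshape(2, 2, 2, 2)

def amplitude(bits):
    """Contract the whole network for one output bit string (b0, b1)."""
    out0 = np.eye(2)[bits[0]]
    out1 = np.eye(2)[bits[1]]
    # Index letters label the lines of the network:
    # a, b = input lines; c = line between H and CNOT; d, e = output lines
    return np.einsum("a,b,ca,decb,d,e->", zero, zero, H, CNOT, out0, out1)

for bits in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(bits, round(abs(amplitude(bits)) ** 2, 3))
```

Each output bit string is an independent contraction, which is one reason this formulation spreads naturally across many processors.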

Zhang and colleagues also relied on a key 
insight: Sycamore’s computation was far 
from exact, so theirs didn’t need to be either. 
Sycamore calculated the distribution of out- 
puts with an estimated fidelity of 0.2% —just 
enough to distinguish the fingerprintlike 
spikiness from the noise in the circuitry. So 
Zhang’s team traded accuracy for speed by 
cutting some lines in its network and elimi- 
nating the corresponding gates. Losing just 
eight lines made the computation 256 times 
faster while maintaining a fidelity of 0.37%. 
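The factor of 256 is consistent with 2^8: each line in the network carries an index with two values, so fixing eight lines picks out one of 256 equal-sized pieces of the full sum. The sketch below shows that decomposition on two small random tensors sharing K lines. It is an illustration of the general slicing idea, and reading the article's figure this way is an assumption rather than a description of the team's code.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
K = 3  # number of shared lines to cut here; the article's case would be K = 8

# Two tensors joined by K lines, each line carrying an index of size 2
A = rng.normal(size=(4,) + (2,) * K)
B = rng.normal(size=(2,) * K + (5,))

# Exact contraction: sum over every value of every shared index
exact = np.tensordot(A, B, axes=(list(range(1, K + 1)), list(range(K))))

# Sliced contraction: fix the shared indices to one setting at a time
slices = []
for bits in itertools.product([0, 1], repeat=K):
    a_slice = A[(slice(None),) + bits]  # shape (4,)
    b_slice = B[bits]                   # shape (5,)
    slices.append(np.outer(a_slice, b_slice))

total = sum(slices)
print("slices for K = 3:", len(slices))
print("all slices reproduce the exact result:", np.allclose(total, exact))
print("slices for K = 8:", 2 ** 8)
```

Keeping only some of the slices does proportionally less work and gives an approximate answer, which is the trade of accuracy for speed described above.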

The researchers calculated the output 
pattern for 1 million of the 9 quadrillion 
possible number strings, relying on an in- 
novation of their own to obtain a truly ran- 
dom, representative set. The computation 
took 15 hours on 512 GPUs and yielded the 
telltale spiky output. “It’s fair to say that 
the Google experiment has been simulated 
on a conventional computer,” says Dominik
Hangleiter, a quantum computer scientist 
at the University of Maryland, College 
Park. On a supercomputer, the compu- 
tation would take a few dozen seconds, 
Zhang says—10 billion times faster than 
the Google team estimated. 
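The quoted figures can be cross-checked with a few lines of arithmetic. The sketch below assumes a 365.25-day year and takes "10 billion times faster" literally; the variable names are illustrative.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Google's 2019 estimate for a supercomputer: 10,000 years
google_estimate_s = 10_000 * SECONDS_PER_YEAR

# Zhang's claimed speedup over that estimate
speedup = 1e10
supercomputer_s = google_estimate_s / speedup

# Cost of the run actually performed: 15 hours on 512 GPUs
gpu_hours = 15 * 512

print(f"implied supercomputer time: {supercomputer_s:.1f} seconds")
print(f"GPU-hours used in the reported run: {gpu_hours}")
print("Sycamore's time for comparison: 200 seconds")
```

The implied time of roughly half a minute matches the "few dozen seconds" in the text and is shorter than Sycamore's 200 seconds.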

The advance underscores the pitfalls of 
racing a quantum computer against a con- 
ventional one, researchers say. “There’s an 
urgent need for better quantum supremacy
experiments,” Aaronson says. Zhang
suggests a more practical approach: “We 
should find some real-world applications to 
demonstrate the quantum advantage.” 

Still, the Google demonstration was not 
just hype, researchers say. Sycamore re- 
quired far fewer operations and less power 
than a supercomputer, Zhang notes. And 
if Sycamore had slightly higher fidelity, he 
says, his team’s simulation couldn’t have 
kept up. As Hangleiter puts it, “The Google 
experiment did what it was meant to do, 
start this race.” 
U.S. CLIMATE POLICY


Ambitious bill leads to 40% cut 
in emissions, models show 


But more action is needed to reach Biden's pledge 
to halve greenhouse gas emissions by 2030 


By Erik Stokstad 


For climate advocates in the United
States, the past month felt like a roller 
coaster. In early July, negotiations in 
Congress on clean energy legislation 
of historic proportions collapsed, and 
the effort seemed doomed. But back- 
room talks continued and last week key sen- 
ators suddenly announced an agreement 
on a $369 billion bill that would provide 
the most climate funding ever seen in the 
United States. “It was the best kept secret, 
potentially, in Washington history,” says
Leah Stokes, a political scientist at the Uni- 
versity of California (UC), Santa Barbara. 

The backers, Senate Majority Leader
Chuck Schumer (D-NY) and Senator Joe
Manchin (D-WV), the latter of whom had
initially balked at the cost, announced that
the draft bill would ensure U.S. carbon
dioxide (CO2) emissions would fall by 40%
by 2030, compared with 2005.

Sponsors of the bill, which must still 
pass the full Senate and the House of Rep- 
resentatives, might be expected to oversell 
its impact. But energy and climate model- 
ers have now scrutinized its 725 pages and 
concluded the 40% claim is about on target. 


They plugged major provisions, including 
subsidies for renewable energy and tax cuts 
for electric vehicles, as well as controversial 
incentives for the fossil fuel industry, into 
their models. Three models now agree that 
if the bill’s provisions are carried out, U.S. 
greenhouse gas emissions would fall by per- 
haps 40% by 2030, although only part of 
that stems from the bill alone. One model 
also finds that the renewable energy subsi- 
dies will likely create 1.5 million jobs and 
prevent thousands of premature deaths 
from air pollution, especially in disadvan- 
taged communities. 

“It’s a historic step, no doubt about it,”
says Marshall Shepherd, an atmospheric sci- 
entist at the University of Georgia and for- 
mer head of the American Meteorological 
Society. “It really does a lot to enhance the 
transition to a renewable energy economy.” 

A proposed bill would subsidize renewable
energy sources, such as this solar farm in Florida,
but also support some fossil fuel operations.

U.S. emissions have been falling by about
1% per year since 2005, when they peaked,
largely because of replacing coal power
with wind and solar, as well as natural gas,
and rising fuel economy in light cars. But
this pace is nowhere near fast enough to
meet President Joe Biden’s goal of a 50%
to 52% cut in emissions by 2030 relative
to 2005. Officials pledged that dramatic
reduction as the U.S. contribution to the
Paris accord’s goal of holding global temperature
rise to 1.5°C.
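To make the gap concrete, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that the roughly 1% annual decline compounds steadily from the 2005 peak and that 2022 is the year any faster decline would begin. Neither the compounding form nor the function names come from the modelers cited in this story.

```python
# Back-of-the-envelope check of the "1% per year" pace against the 50% goal.
# Assumptions (not from the article's models): a constant compound rate,
# a 2005 baseline of 1.0, and 2022 as the year any faster decline would begin.

def remaining_fraction(annual_decline: float, years: int) -> float:
    """Fraction of baseline emissions left after compounding a yearly decline."""
    return (1.0 - annual_decline) ** years


def required_annual_decline(current: float, target: float, years: int) -> float:
    """Constant yearly decline needed to move from `current` to `target`."""
    return 1.0 - (target / current) ** (1.0 / years)


# Historical pace: 1% per year from 2005 through 2030.
cut_by_2030 = 1.0 - remaining_fraction(0.01, 2030 - 2005)

# Pace needed from 2022 onward to reach a 50% cut by 2030.
level_2022 = remaining_fraction(0.01, 2022 - 2005)
needed_rate = required_annual_decline(level_2022, 0.50, 2030 - 2022)

print(f"Cut by 2030 at 1% per year: {cut_by_2030:.1%}")
print(f"Yearly decline needed from 2022: {needed_rate:.1%}")
```

Under these simplifying assumptions, the historical pace delivers a cut of a bit over one-fifth by 2030, and reaching one-half would take a yearly decline several times faster.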

Biden’s major effort had been the Build 
Back Better Act, which would have invested 
$560 billion in cutting greenhouse gases but 
died in the Senate after Manchin objected. 
The smaller new bill, called the Inflation Re- 
duction Act of 2022, preserves much of the 
bang for clean energy, says energy systems 
expert Jesse Jenkins of Princeton University’s 
Rapid Energy Policy Evaluation and Analysis 
Toolkit Project, which runs one of the models. 
“I think [Senate staff] did a miraculous job,”
he says. In particular, the bill provides sub- 
sidies to expand renewable energy and lure 
consumers to buy electric vehicles, solar pan- 
els, and climate-friendly home heat pumps. 

To evaluate the climate impacts of the 
legislation, Jenkins and other 
modelers simulate the entire 
U.S. energy system, from the 
smallest electric vehicles to 
nuclear plants, and add the 
proposed policies to see how 
they impact CO₂ emissions.
Scientists also fold in results 
from other models that focus 
on factors such as the impact 
of agricultural policies on 
two other causes of green- 
house warming: methane 
emissions from livestock and 
nitrous oxide released from 
fertilized fields. Modelers put 
everything together to forecast emissions 
trends, says modeler Ben King of the Rho- 
dium Group, an independent research firm. 

Just a day after the bill was released, 
Rhodium posted preliminary estimates 
on its website. The topline result: a 31% 
to 44% reduction in greenhouse gas emis- 
sions from 2005. Compared with current 
policies, that’s an additional drop of 7 to 
9 percentage points. Variables such as the 
price of natural gas account for much of 
the uncertainty: If gas prices drop, utili- 
ties might favor gas over renewable power, 
slowing the decline in carbon emissions. 
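Rhodium's two figures together imply a current-policy baseline, which can be recovered with simple subtraction. The sketch below assumes the low end of the total pairs with the low end of the additional drop, and high with high. The article does not state that pairing, and the variable names are illustrative.

```python
# Recover the implied current-policy reduction from Rhodium's reported ranges.
# Assumption: low pairs with low and high with high; the article does not say so.

total_cut = (31, 44)       # % reduction from 2005 with the bill (low, high)
extra_from_bill = (7, 9)   # additional percentage points versus current policy

# Subtract the bill's extra contribution from the total, end by end.
baseline_cut = tuple(t - e for t, e in zip(total_cut, extra_from_bill))

print(f"Implied current-policy cut: {baseline_cut[0]}% to {baseline_cut[1]}%")
```

If the pairing assumption holds, most of the projected decline would happen under current policies, with the bill adding the final several points.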

This week, the think tank Energy In- 
novation narrowed the range, forecasting 
emissions reductions of 38% to 41%, with 
13% to 17% from the bill alone. And the 
Princeton model estimated about a 42% 
reduction, with 15% from the bill itself. 

All the analyses find the two most im- 
portant factors driving down emissions 
are clean electricity tax credits—which the 
bill provides for at least a decade—and ex- 
panded tax credits for both new and used 
electric vehicles. The subsidies will help 
utilities install more capacity from wind 
farms and solar panels and help keep nu- 
clear power plants financially viable as they 
face competition from cheap natural gas. 
Previous analyses had also pointed to green 
electricity generation and transportation 
as crucial to reducing emissions (Science, 
27 May, p. 922). 

Models can have difficulty predict- 
ing human behavior, cautions economist 
Meredith Fowlie of UC Berkeley. “I wouldn’t 
believe any one projected number, but [key] 
models agree in a qualitative sense that this 
is going to bend the trajectory,” she says. 

Other provisions of the proposed bill 
could eventually lead to further CO₂ reductions,
such as investment in technologies
that directly remove carbon from the atmo- 
sphere and capture it from fossil fuel plants. 

The bill also includes some climate- 
unfriendly provisions, apparently added at 
Manchin’s request. It requires the federal 
government to offer several 
lease sales of offshore oil and 
gas resources, with more on 
the table if public lands are 
opened to renewable en- 
ergy efforts like wind farms. 
The leases could boost oil 
and gas production from 
federal lands by an extra 
50 million tons per year in 
2030, according to Energy 
Innovation. Overall, how- 
ever, climate wins out, ana- 
lysts say: For each additional 
ton of CO₂ from fossil fuels,
other provisions of the bill 
would reduce emissions by 24 tons. 

The bill must still pass the Senate, where 
Democrats need every possible vote in 
their party, and then it will go back to the 
House. Stokes, who advised Democrats on 
the bill, says she’s hopeful the measure will 
be on Biden’s desk by mid-August. “The 
United States is really going to be a climate 
leader globally if we can get this bill over 
the finish line.” 

The measure won’t be enough, however, 
for the United States to reach its Paris goal 
of a 50% greenhouse emissions reduction 
by 2030. For that, more federal regulation 
and state action will be necessary, King 
and others say. “It’s all hands on deck,” says 
energy and climate modeler John Bistline 
of the Electric Power Research Institute. 

The ultimate—and necessary—goal is 
cutting U.S. emissions to zero, says Emily 
Grubert, a civil engineer and environmen- 
tal sociologist at the University of Notre 
Dame. “People keep talking about this as 
the biggest climate investment in a genera- 
tion. I can only say—I hope not.” 


DEVELOPMENTAL BIOLOGY 


Mouse stem 
cells grown into 
embryo mimics 


Bioreactor lets “embryoids” 
mature long enough for 
multiple organs to form 


By Mitch Leslie 


What happens in embryonic development
is one of nature’s best guarded secrets,
unfolding deep in the mother’s body. Now,
researchers have opened a new
window on the process. They’ve
made artificial mouse embryos from stem 
cells—no sperm or eggs required—and 
used an innovative bioreactor to nurture 
their creations for longer than any previ- 
ous embryo models. The simulated em- 
bryos developed anatomy that matched the 
real thing and “very impressive similarities 
at the cellular level. The right cells arise 
at the right time,” says stem cell biologist 
Niels Geijsen of the Leiden University 
Medical Center, who was not involved in 
the work. 

The feat, reported this week in Cell, 
may allow biologists to delve deeper into 
developmental mechanisms and better un- 
derstand what goes wrong in birth defects. 
And the team’s leader, stem cell biologist 
Jacob Hanna of the Weizmann Institute of 
Science, says that next, he hopes to do the 
same with comparable human stem cells. 

Researchers have already reprised parts 
of early development with embryo mimics 
made from an assortment of mouse or hu- 
man stem cells, including embryonic stem 
(ES) cells, which are derived from normal 
embryos and can form all of a body’s tis- 
sues. They’ve mimicked the blastocyst, the 
simple developmental stage that implants 
in the uterus, and recreated gastrulation, 
when embryos become multilayered. These 
simulated embryos hit a developmental 
wall, however. Their cells begin to special- 
ize but do not coalesce into organs. 

One obstacle has been keeping the ersatz 
embryos alive for more than a few days. 
Last year, Hanna and colleagues unveiled 
a nurturing procedure that allowed them 
to grow standard mouse embryos outside 
of the mother’s body for a record 11 days. 
(Typical mouse gestation is about 20 days.) 
A key step involves placing the embryos in
an incubator outfitted with a Ferris wheel- 
like device, which rotates the embryos in- 
side bottles of liquid filled with nutrients 
and growth factors. The setup enables the 
team to precisely control growth condi- 
tions such as oxygen levels. 

Those embryos came from fertilized
mouse eggs, however. To determine
whether the same procedure would allow 
stem cells to transform into full-fledged 
embryos, Hanna’s team mingled basic 
mouse ES cells with ES cell lineages geneti- 
cally altered to spawn tissues outside the 
embryo that shape and support its growth. 
After initially rearing the cell congrega- 
tions on culture plates, the team shifted 
them to rotating bottles on the fifth day. 

By the eighth day, the “embryoids” were 
very similar to 8.5-day-old natural em- 
bryos and boasted a beating heart, distinct 
head and tail ends, the blocklike segments 
that become skeletal muscles, a develop- 
ing brain and spinal cord, and the begin- 
nings of other organs. The researchers 
also measured gene activity in more than 
40,000 embryoid cells, finding all of the 
expected cell types in the correct locations, 
Hanna says. 

For unknown reasons, the artificial em- 
bryos stalled at the eighth day of develop- 
ment. The researchers hope to overcome 
this barrier and extend development even 
further. Still, stem cell-derived embryos 
have an advantage over normal mouse 
embryos for research because the cells are 
available in larger numbers and scientists 
can more easily manipulate them, says 
stem cell biologist Nicolas Rivron of the 
Institute of Molecular Biotechnology of the 
Austrian Academy of Sciences. 

The current procedure for making the sim- 
ulated embryos fails most of the time—less 
than 1% of the initial cell aggregations form 
embryo mimics. But, Hanna notes, “The ad- 
vantage of this technique is that we can make 
millions of aggregates in one batch.” 

Achieving the same feat with human ES 
cells might avoid some of the controversies 
of research on human embryos. “This is 
providing an ethical and technical alterna- 
tive to the use of embryos,” Rivron says. 

Hanna has co-founded a company that 
will investigate whether the approach 
will work with human induced pluripo- 
tent stem cells, which are derived from 
adult cells rather than embryos. Cells and 
tissues in an embryo release factors that 
orchestrate the correct development of 
their neighbors. So growing stem cells into 
artificial embryos first may provide a bet- 
ter way of producing cell types that can be 
transplanted to treat human diseases. It is 
“more physiological,” Hanna says. 
COVID-19


Making broader coronavirus 
vaccines is a struggle 


Efforts to protect against future variants or novel 
coronaviruses face funding constraints and other problems 


By Jon Cohen 


There’s a new call from the White House
to develop vaccines that might protect
against future SARS-CoV-2 mutants
or even unknown coronaviruses. “The
vaccines we have are terrific, but we
can do better than terrific,” Ashish
Jha, White House COVID-19 response coor- 
dinator, said at a vaccine summit last week 
that gathered researchers, companies, and 
government officials. “Ultimately, we need 
vaccines that can protect us no matter what 
Mother Nature throws at us.” 

But no specific funding request to Congress 
was revealed at the summit, or any concrete 
plans, so vaccine developers 
and the public shouldn’t ex- 
pect a second Operation Warp 
Speed, the U.S. government’s 
multibillion-dollar crash 
program to develop the first 
COVID-19 vaccines. And the 
scientific, logistical, and regu- 
latory hurdles for any next- 
generation vaccines are higher. 

Operation Warp Speed proved it was pos- 
sible to race from a newly identified virus 
to safe and effective immunizations in just 
11 months—many times faster than ever 
before. Today, there are a few dozen fledg- 
ling efforts to create vaccines that could 
thwart future SARS-CoV-2 mutants or of- 
fer even broader protection, extending to 
unknown coronaviruses that have yet to 
jump into humans. 

But only a single candidate, developed 
by the U.S. Army, has made it into a phase 
1 clinical trial. “We want to start clinical tri- 
als tomorrow, but there are lots of barriers to 
getting there,” says Yale University immuno- 
logist Akiko Iwasaki, who has a coronavirus 
vaccine candidate that’s administered as a 
nasal spray. 

For starters, funding for these broader 
vaccines remains far tighter than in the Warp 
Speed days: The Coalition for Epidemic Pre- 
paredness Innovations (CEPI) has invested 
a more modest $200 million in 11 efforts run
by small companies and academics, and the
U.S. National Institute of Allergy and Infectious
Diseases (NIAID) has committed just
$43 million in four pancoronavirus vaccine 
programs. Efforts also face a dearth of mate- 
rials needed to make vaccines, a shortage of 
nonhuman primates on which to test candi- 
dates, and uncertainty about how to assess 
new products in populations that already 
have an immune response to SARS-CoV-2.

White House summits aside, some scien- 
tists contend there’s a deeper barrier. “The 
sense of urgency is completely gone,” says 
Florian Krammer, a virologist at the Icahn 
School of Medicine at Mount Sinai. 

That’s understandable, says Moncef 
Slaoui, the scientific leader of Operation 
Warp Speed and a veteran vaccine devel- 
oper. Even though new SARS-CoV-2 vari- 
ants are eroding the ability of 
today’s COVID-19 vaccines to 
block infection and prevent 
symptomatic disease, those 
vaccines are still prevent- 
ing severe illness and death. 
“Current vaccines are effec- 
tively able to deal with the 
pandemic, because the No. 
1 priority is mortality and 
morbidity,” Slaoui says. “Pancoronavirus
vaccines, whatever definition you use for 
them, are about preparedness, rather than 
dealing with the actual pandemic.” 

The “pan” in potential next generation 
coronavirus vaccines is often in the eyes 
of the beholder, as the projects underway 
have a variety of aims. The most modest, 
but still ambitious, goal is to avoid racing 
to create specific boosters for the latest 
SARS-CoV-2 variant and instead to develop 
vaccines that can reliably ward off severe 
disease from any future mutants. 

Lawrence Corey of the Fred Hutchinson 
Cancer Research Center, who co-led the clini- 
cal trials network for Operation Warp Speed, 
wants to go further by developing COVID-19 
vaccines that reduce not just severe disease, 
but the risk of infection and transmission of 
all SARS-CoV-2 spinoffs. “Do we really want 
to have 90,000 COVID deaths [in the United 
States] a year?” Corey asks. 

And he calls for more government back- 
ing so as to not squander the momentum of 
the field. “There are plenty of ideas,” he says.
“What's not forthcoming is the commitment.” 


The next level of coronavirus vaccines
shoots for broad protection against sarbeco- 
viruses, the grouping that includes SARS- 
CoV-2 and SARS-CoV-1—the cause of the 
outbreak of severe acute respiratory syn- 
drome 20 years ago—and viral relatives in 
bats and pangolins that could spark the next 
human outbreak. A vaccine with even more 
“breadth” would thwart the beta genus of 
coronaviruses, which includes sarbecoviruses, 
merbecoviruses like the one that causes Mid- 
dle East respiratory syndrome, and two that 
now trigger the common cold. The ultimate 
vaccine would work against all four genera in 
the coronavirus family. 

All of the work in this area is “really quite 
early,” says Melanie Saville, who heads vaccine
R&D at CEPI. “I would classify this as
high-risk, high-reward research.” 

Kayvon Modjarrad and colleagues at the 
Walter Reed Army Institute of Research 
(WRAIR) have the only pancoronavirus vac- 
cine candidate to reach clinical 
testing so far. Aimed at SARS- 
CoV-2 variants, it uses the same 
coronavirus surface protein, 
spike, as many existing shots 
but tries to improve the way 
it is presented to the immune 
system. The WRAIR candi- 
date contains several copies of 
spike bound to ferritin, a pro- 
tein that normally carries iron 
around the blood. Receptors 
on the surface of antibody- 
making B cells can then “cross- 
link” to these closely arrayed 
spikes, which theoretically leads 
to production of more powerful 
antibodies. In test tube stud- 
ies, the vaccine “neutralized” a 
broad range of SARS-CoV-2 vari- 
ants. WRAIR plans to publish 
data soon from its phase 1 study, but declined 
Science’s request to interview investigators. 

The limited availability of animals to test 
experimental vaccines has stalled other proj- 
ects. Andrew Ward, a structural biologist at 
Scripps Research, says there is “enormous 
competition” for monkeys and the best mouse 
model systems. Operation Warp Speed allo- 
cated research animals to the most promis- 
ing vaccine candidates, but now efforts have 
just become a “free-for-all, not a coordinated 
Manhattan Project,” says Ward, who, as a result,
has turned most of his attention from
coronavirus to paninfluenza vaccines. 

Iwasaki also has had trouble getting mon- 
keys for her studies of nasal sprays, which 
she contends might be able to broaden pro- 
tection by stimulating production of muco- 
sal antibodies with twice as many arms to 
bind viruses as the Y-shaped ones triggered 
by injection. “If there was a government-organized
effort to help us, it would have
gone much faster,” she says. 

NIAID only funds two investigator-initiated 
grants for pancoronavirus vaccine research. 
One went to Lbachir BenMohamed, an im- 
munologist at the University of California, Ir- 
vine, who has a 5-year, $3.6 million grant and 
had hoped to launch a clinical trial this year 
of a pan-sarbecovirus vaccine. But his team, 
too, has had to wait for access to animal mod- 
els, which it needs to select the most prom- 
ising vaccine candidate. He is now looking 
at 2023—if his team can overcome another 
challenge. 

BenMohamed’s team first analyzed 
sarbecoviruses that have infected humans, 
camels, bats, minks, and pangolins for 
shared genetic sequences. They made pro- 
teins that reflect the conserved regions and 
linked them together. Like several existing 
vaccines, their vaccine encodes these conserved
proteins in messenger RNA and relies
on the body to turn that code back into protein.
But a shortage of the lipid shells needed
to enclose the RNA has slowed progress.

The U.S. Army has a phase 1 trial of a pancoronavirus vaccine underway.

Biochemist Pamela Bjorkman of the 
California Institute of Technology says 
her team’s pan-sarbecovirus vaccine can- 
didate likely won’t enter the clinic until 
perhaps 2024. That team has plucked a 
critical portion of spike from eight differ- 
ent sarbecoviruses—the so-called recep- 
tor-binding domain—and stitched them 
together into a “mosaic.” Her team has 
faced manufacturing challenges, and only 
received substantial support recently, with 
CEPI announcing on 5 July a grant of up to 
$30 million to take the candidate through 
phase 1 trials. “This is taking longer than 
we'd hoped,” Bjorkman says. 

To prove their worth, pancoronavi- 
rus vaccines will have to travel a much 
rougher road than the first COVID-19 


shots. The people in trials of the first CO- 
VID-19 vaccines had no specific immunity 
to SARS-CoV-2, making it straightfor- 
ward to assess whether the shots pro- 
vided protection. Today, most everyone 
has been vaccinated, infected with the 
virus, or both. Even the lowest hurdle for 
a pancoronavirus vaccine—proof of pro- 
tection against all known SARS-CoV-2 
variants—will be difficult to establish, pre- 
dicts Barney Graham, who long worked on 
pancoronavirus vaccines at NIAID. 

Graham, who is now at Morehouse Col- 
lege, also notes that the immune response 
to any new SARS-CoV-2 vaccine will be 
skewed by the immune system’s memory 
of the first viral proteins it encountered, 
whether through vaccination or infection 
with one of the many variants that have 
circulated. So assessing a new vaccine’s 
protection may require trials in people 
who have “no competition in the immune 
system,” which in many
countries would now mean 
infants. “There are a lot of 
fundamental biological and 
immunology questions to answer,”
Graham says. “It’s not
going to go as fast as before.” 

Exactly how these vaccines 
would be used—given as pe- 
riodic boosters that top up 
existing SARS-CoV-2 immu- 
nity and add breadth or held 
in reserve, for when a new 
coronavirus surfaces—also 
remains a big question. And 
developing a panvaccine as in- 
surance is a risky proposition, 
Graham says. A candidate vet- 
ted in clinical trials and sitting 
on the shelf might only have 
limited efficacy against, say, a 
SARS-CoV-3. Graham says if he had to make 
a choice with limited resources, he’d prefer 
investing in having “knowledge on the shelf” 
to quickly make a pathogen-specific vaccine 
rather than paying to have a panvaccine at 
the ready that could only “hold the fort until 
you get the real thing a few months later.” 

Slaoui says if we “disrupt the economic 
model” for R&D and take a Warp Speed ap- 
proach, pancoronavirus vaccine develop- 
ment could be “up to 10 times faster.” But he 
contends the best bang for buck, especially 
when it comes to thwarting an entirely novel 
coronavirus, will come from investing heav- 
ily in building vaccine manufacturing plants 
that can quickly make vetted candidates 
against a new threat. “The day somebody 
comes up with a panvaccine that actually 
works, we should celebrate it,” Slaoui says.
“But we will only know that when another 
pandemic comes and we try it.” 
FIERY INVASIONS

Around the world, flammable invasive grasses are
increasing the risks of damaging wildfires

By Warren Cornwall


For decades, eastern Oregon’s
scablands—rocky patches of open 
terrain—were a refuge for people 
fighting wildfires in the surrounding 
forests. The thin soil and sparse vege- 
tation offered little fuel for the flames, 
creating an oasis from which fire- 
fighters could operate and a barrier 
that could help halt a fire’s spread. 
That all changed in 2015. After lightning
sparked a fire near a steep-walled canyon, 
the blaze unexpectedly raced across scab- 
lands so quickly that firefighters struggled 
to catch up. In the end, the Corner Creek 
Fire scorched more than 11,000 hectares. 
And Jeff Priest, who has spent more than 
2 decades fighting fires in Oregon for the 
U.S. Forest Service (USFS), realized he had
a new problem on his hands: the arrival
of an invasive, shin-high grass known as 
Ventenata dubia. The plant created shaggy 
golden carpets of dry foliage, transforming 
once fire-resistant scablands into flame- 
friendly corridors. 

“We knew it was coming,” Priest says 
about the annual commonly called wire- 
grass, which is native to countries surrounding
the Mediterranean. “But all of a
sudden, it was there.” 

Ventenata’s spread into the forests of the 
northwestern United States is just the lat- 
est chapter in a phenomenon reshaping 
ecosystems, and wildfire, around the
globe. In northern Australia, invasive 
gamba grass from Africa fuels intense blazes 
that rip through eucalyptus groves. In Bra- 
zil, molasses grass from Africa turns vast 
swaths of the savanna known as the Cer- 
rado into fire-prone grassland. In the west- 
ern United States, two Old World grasses 
are creating ecological mayhem: Buffelgrass 
feeds fires in the Sonoran Desert that torch
iconic saguaro cacti, while blaze-tolerant
cheatgrass crowds out native sagebrush in
the high desert known as the Great Basin.

Australian cattle ranchers planted gamba grass
from Africa to create forage. It forms dense walls
of vegetation that have become a major fire hazard.

Even as catastrophic wildfires that roar 
through towering treetops capture the pub- 
lic’s attention, ecologists have been paying 
increasing attention to this less conspicu- 
ous trend: how seemingly modest non- 
native grasses are allying with fire to eat 
away at dry forest and savanna ecosystems. 

These invasive grasses can hijack fire 
to create a self-reinforcing cycle, explains 
Carla D’Antonio, an ecologist at the Univer- 
sity of California, Santa Barbara, who has 
studied the phenomenon for more than 
3 decades in Hawaii and California. Once 
established, the grasses help fuel blazes that 
kill and suppress less fire-tolerant native 
plants, opening up new territory for the in- 
vaders to colonize—catalyzing yet more fire. 
In a short time, land that was once shrub- 
land, savanna, or dry forest is locked into 
being a grassland. “It’s that trigger of grass 
and fire that sets the system off in some un- 
desirable direction,” D’Antonio says. 

These grass invasions are now threat- 
ening native plants and the animals that 
rely on them, reshuffling nutrients in the 
soil and the ability of ecosystems to store 
planet-warming carbon, and disrupting ef- 
forts to use fire to benefit the native flora. 
And once the invaders take hold, eco- 
logists say it’s tough to break their grip. In 
such places, the future promises to be hot, 
smoky, and full of grass. 


GRASSES AND FIRE have been intertwined 
since before humans walked the planet. Mil- 
lions of years ago in southwest Africa and 
Asia, a dramatic rise in wildfires went hand 
in hand with the emergence of vast grass- 
lands, researchers say. There and elsewhere, 
changes in weather patterns—particularly 
the emergence of a dry season—helped 
grasses spread. Select species developed a 
new way to photosynthesize that gave them 
an advantage in hotter, drier environments. 

Evidence suggests some of these grasses 
evolved to thrive with fire, says Allison 
Karp, a paleoecologist and postdoctoral re- 
searcher at Yale University. In ancient sedi- 
ments pulled from the Bay of Bengal near 
India’s eastern coast, for example, she found 
carbon isotopes trapped in ancient plant 
waxes that indicate grasses became wide- 
spread on the subcontinent about 7 million 
years ago. During the same period, accord- 
ing to the sediments, molecules tied to 
wildfires increased 10-fold, and the chemi- 


cal traces suggest grasses played an outsize 
role in fueling those blazes. 

Today, some of the most problematic in- 
vasive grasses seem built to burn. They grow 
and dry quickly, creating abundant fuel 
each year. Certain species have leaves filled 
with oily tannins—chemicals that slow the 
decay of dead leaves, making it easier for 
them to ignite. One species, molasses grass, 
is coated with a residue that enables it to 
catch fire even while green. 

Although fire can kill invasive grasses, 
they often bounce back quickly, enabling 
them to outcompete charred competitors, 
including native grasses that didn’t evolve 
with frequent, intense fires. The lack of 
woody trunks and branches means grass 
seedlings start to photosynthesize before 
trees or shrubs put out leaves. Some grasses 
resprout from rootlike stems that grow 
underground, insulated from the flames. 

Such adaptability has helped many 
grasses naturally expand their ranges. But 
in recent times, humans have accelerated 
that process by scattering grass seeds far 
from their native habitats, sometimes by 
accident and sometimes intentionally, to 
feed livestock, control erosion, and decorate 
gardens. “The [grass] invasions in the last 
100 years or so are just a radical example of 
a speeded-up process that’s been happening 
over millennia,” says Dave Richardson, an
ecologist and invasive plant expert at South 
Africa’s Stellenbosch University. 

Southern Africa is a disproportionate 
source of the grasses that have invaded 
other parts of the world, Richardson has 
found. Grasses evolved there to take advan- 
tage of frequent disturbances, such as fire 
and grazing by herds of wildlife, making 
them tough competitors in new habitats. 
And once they gain a roothold outside Af- 
rica, fire often follows. In the United States, 
when scientists compared fire behavior in 
areas invaded by fire-prone grasses with 
nearby uninvaded areas, they found six 
different grasses were tied to as much as a 
150% increase in fire frequency, according 
to a 2019 report in the Proceedings of the 
National Academy of Sciences. 

In northern Australia, the arrival of gamba 
grass has provided a textbook example of 
this process. In the 1980s, the Australian 
government promoted planting the Afri- 
can grass as forage for cattle. But ecologists 
soon warned of the dangers it posed to the 
nation’s tropical savannas, a mix of sparse 
grasses and eucalyptus trees that cover 
one-quarter of the continent. The original 
ecosystem evolved with frequent, low-level 
fires, including ones set by Indigenous Ab- 
original people to create open savannas that 
improved hunting and provided habitats 
for specific plants and animals. 

But gamba grass formed dense walls of
vegetation, reaching 4 meters tall, that trans- 
formed fire behavior. The grass burned four 
times more intensely than native vegeta- 
tion during experimental burns, researchers 
found. The flames ran so high and hot that 
they had to abandon a common gauge of fire 
behavior—measuring the highest scorched 
leaves in a tree—because even the highest 
leaves in gamba-infested sites were singed. 

In the early 2000s, several Australian 
states reversed course, restricting the use of 
gamba grass. But in many places it was too 
late. By then the grass covered more than 
15,000 square kilometers. Researchers fear
it could ultimately spread through much of
the country’s 2 million square kilometers of
tropical savanna.


OREGON, meanwhile, shows how a single
invasive grass species can alter both
rangelands and forests. In 2015, reports of
Ventenata-driven fires reached Becky
Kerns, an ecologist at USFS’s Pacific Northwest
Research Station. Kerns had given
little thought to the plant, in part because
other invasive grasses were already causing
headaches in the region. Cheatgrass,
for example, was spreading in parts of Oregon.
But the harsh scablands had proved
inhospitable to that invader. Ventenata,
Kerns says, “was really a game changer.”

A year after the Corner Creek Fire, Kerns
and a group of Oregon State University
graduate students began to examine what
wiregrass might mean for the region. Their
findings were alarming. In scablands overtaken
by Ventenata, they found oceans of
grass that create fuel loads 50 times greater
than in areas free of the species. Compared
with cheatgrass, wiregrass can colonize
cooler, higher elevation locations and take
root in thinner soils.

In the process it appears to be crowding
out native plants, such as fire-intolerant
sagebrush, that support the local wildlife.
And it threatens rare endemic plants found
in rocky scablands, such as Spalding’s catchfly,
a federally protected perennial with pale
pink, trumpet-shaped flowers.

Invasion hot spots

Fire-friendly grasses have invaded new habitats around the world. Five species are considered
among the most problematic grasses, threatening to transform entire ecosystems.
All five grasses are native to either Eurasia or Africa.
(Map labels: Cheatgrass, Cogon grass, Hawaii, Molasses grass, Gamba grass, Buffelgrass.)

Unlike some invasive grasses, Ventenata 
doesn’t appear to need disturbances such 
as fire to spread in some parts of eastern 
Oregon. In study plots on native prairies, 
the grass advanced just as quickly over 
unburned land covered with native bunch- 
grass as it did through plots burned for a 
long-term experiment. In both cases, in 
10 years Ventenata spread from approxi- 
mately 10% of the test plots to nearly 60%. 

In forests, by contrast, fires do appear to 
help spread the grass, Kerns says. An examination
of a decade of fires in the region
revealed that more severely burned areas 
were more likely to be invaded. One expla- 
nation for that pattern, researchers say, is 
that although Ventenata doesn’t fare well 
in dense shade, it can encroach on forest 
edges, where it fuels fires that clear parts 
of the overstory, letting in more sunlight. 
Computer simulations suggest that cycle 
could gradually shrink forests, Kerns says, 
like waves eroding a beach. 

That scenario was a revelation, and not a 
welcome one. Historically, Kerns says, land 
managers in the western United States have 
thought of invasive grasses as a problem for 
wide-open rangelands. But modeling sug- 
gests wiregrass could creep into many for- 
ests in eastern Oregon, as well as those in 
dry, higher mountains as far south as Ari- 
zona and New Mexico. “Ventenata has really 
challenged a lot of our notions about inva- 
sive grasses in the West,” she says. 


IN MANY LOCATIONS, scientists are just be- 
ginning to document the impact of invasive 
grasses. In Hawaii, however, D’Antonio has 
been closely observing the interlopers since 
the early 1990s. And her ringside seat has 
enabled her to see how the effects can echo 
through a landscape in complex and sur- 
prising ways. 

In Hawaii’s Volcanoes National Park, for 
example, she has tracked how a handful of 
invasive grasses—many introduced by cattle 
ranchers—have altered environments once 
rarely touched by fire. Studying places that 
first burned in 1970, D’Antonio found that 
in forests dominated by the native ‘Ohi‘a 
tree, relatively shade-tolerant beardgrass 
from South America took root first. But if 
the beardgrass later fuels a fire, she says, 
then a “second invader, molasses grass, 
pours in.” Native grasses simply can’t com- 
pete, she adds. Whereas invasive grasses 
produce large numbers of airborne seeds, 
for example, a native species called Kawelu, 
or love grass, produces just a few seeds 
that drop nearby. Compared with molasses 
grass, she says, “It’s a total wimp.” 

Molasses grass can also alter soil nutrient 
cycles in ways that ultimately benefit inva- 
sive species, D’Antonio has found. As the 
grass leaves decomposed, for instance, they 
led to a bounty of soil nitrogen that helped 
the grass. After 16 years, however, nitrogen 
returned to preinvasion levels as the nutri- 
ent leached out of the soil. Rather than pave 
the way for a return of native plants, how- 
ever, the leaching opened the door to an in- 
vasive tree known, appropriately, as the fire 
tree. That tree, in turn, again increased soil 
nitrogen levels, which revived the molasses 
grass. “The situation there gets more and 
more grim,” D’Antonio says. 
Experiments in plots of longleaf pine in Florida suggest a combination of drought and invasive cogon grass can amplify wildfires and kill the trees.


These new grasses could also scramble 
how much carbon is stored in vegetation 
and soils. In Hawaii, D’Antonio found that 
in unburned areas where native ‘Ohi‘a trees 
remained, the plants held twice the carbon 
of burned areas taken over by grass—even 
25 years after a fire. 

Drought could heighten the impact of the 
grass-fire cycle, suggest studies by ecologist 
Luke Flory of the University of Florida. He 
has run field experiments that simulate 
how drought affects longleaf pine forests in- 
vaded by cogon grass from Asia, which can 
grow in dense, waist-high thickets topped 
with fluffy seed heads. 

Flory’s team grew pines on small plots 
in Florida; many were infested with co- 
gon grass whereas others only held native 
plants. Some of these miniforests were cov- 
ered by canopies that limited rain to simu- 
late drought. 

After 6 years, trees in the dry plots aver- 
aged about 3 meters tall, nearly 1.5 meters 
shorter than those in wetter 
plots. The researchers then set 
fire to all of the plots. In those 
infested with cogon grass, the 
fires burned hotter and the 
flames rose higher than in 
plots without the grass. And in 
the dry, grassy plots, the com- 
bination of shorter trees and 
higher flames proved “highly 
problematic,” Flory says, with 
nearly half of the trees dy- 
ing within 1 month of being 
burned. But only 10% of trees 
in the wetter sites died, regard- 
less of whether cogon grass 
was present. 
Flory is reluctant to sound the alarm
too loudly, because there are still plenty of 
unknowns about how drought, fire, and co- 
gon grass will interact in the wild. But the 
experimental results so far are worrying, 
he says. “None of it’s good. ... It’s just how 
much evidence we have that it’s really bad.” 


SUCH EXPERIMENTS have helped focus atten- 
tion on a practical matter: What can land 
managers do about the tenacious invaders 
once they arrive? 

Eradicating the grasses is often not pos- 
sible once they have reached high densities, 
researchers say. But preventing them from 
completely transforming ecosystems might 
be an option in some places. In Volcanoes 
National Park, for example, park officials 
are trying to rehabilitate shrubby or forested 
habitats that have burned by replanting them 
with more fire-hardy native species such as 
mamane, which can grow to 15 meters tall. 
The mamane could restore lost habitats for
birds and insects, while being less vulnerable
to grass-fueled fires, says Sierra McDaniel,
a botanist who leads the park’s natural re-
source program. “We are accepting these
grasses are widespread,” she says, and instead
focusing on “how do you live with them?”

In Arizona’s Saguaro National Park, workers are trying to
curb buffelgrass, which fuels blazes that threaten saguaro cacti.

In Oregon, USFS officials are weighing 
difficult choices in the fight against wire- 
grass. On one hand, they want to intention- 
ally burn more forests, in order to clear out 
brush and saplings, and restore ecosystems 
that historically evolved with fire. But such 
controlled burns also risk opening new ar- 
eas to grass invasions. 

With that in mind, Kerns has obtained 
funding for an experiment that will use 
herbicides to kill Ventenata along forest 
edges before crews set controlled burns, 
to see whether that curbs later encroach- 
ment by the grass. In the scablands, Priest 
is trying a more aggressive approach: send- 
ing in heavy equipment such as bulldozers 
to help squelch fires, fearing that native 
plants will suffer if the open 
areas burn. But he worries the 
equipment could leave lasting 
marks on delicate ecosystems— 
and that the disturbance could 
ultimately benefit the grasses. 
Learning to live in a “Ventenata 
world” is challenging, Priest 
says, and he doesn’t know 
whether “I actually have the 
right answer.” 

Kerns echoes that uncer- 
tainty. Trying to cope with inva- 
sive grasses poses a conundrum, 
she says. “We kind of want to go 
back and restore things, but it’s 
not the same world.”
Transparency practices at the
FDA: A barrier to global health 


Data sharing among regulators must be “business as usual” 


By Murray M. Lumpkin, Margaret A. Hamburg, William B. Schultz, Joshua M. Sharfstein


572 5 AUGUST 2022 » VOL 377 ISSUE 6606 


ILLUSTRATION: DAVIDE BONAZZI/SALZMANART

During the COVID-19 pandemic, sci-
entists at the US Food and Drug
Administration (FDA) have reviewed
large numbers of pandemic-related
tests, medications, and vaccines.
However, long-standing confidenti-
ality practices have kept FDA from sharing
many of these analyses and the data behind
them with the regulatory agencies of other
nations, especially those in low- and middle-
income countries (LMICs). With FDA not
sharing key information, the primary source
of dependable COVID-19 product regula- 
tory documentation and information for re- 
source-constrained countries has been the 
World Health Organization (WHO) in coor- 
dination with leading European regulators. 
These efforts are commendable, but in many 
cases FDA’s assessments will be some of the 
most sought after and scientifically robust 
in the world—and should be shared with 
the widest possible regulatory audience. 
FDA must demonstrate similar leadership
and commitment to global health by re- 
forming its outdated, restrictive practices 
on information sharing. 

Even absent a pandemic, most agen- 
cies find the challenge of medical product 
regulation daunting. Thousands of medical 
products come onto the global market an- 
nually, but few agencies have the capabil- 
ity to assess these products thoroughly. The 
WHO has estimated that only one-quarter 
of its member states have agencies with 
at least a “stable, well-functioning, and 
integrated regulatory system” (1). In most
countries, underfunded and understaffed 
agencies struggle to meet basic regulatory 
tasks. Even mature agencies find that they 
sometimes lack the resources they need to 
meet expectations. 

In 2020, a US National Academies of 
Sciences, Engineering, and Medicine com- 
mittee highlighted one solution to this chal- 
lenge: strengthening reliance-based reg- 
ulatory pathways in which agencies use 
the extensive reviews and inspections 
conducted by trusted counterparts to bet- 
ter inform their own regulatory decision- 
making (2). For such an approach to work, 
timely sharing of complete critical infor- 
mation and relevant documents between 
agencies is essential. Such access allows 
a relying agency not only to trust but to 
understand what a reference agency has 
decided, and then to use that information 
to help inform the appropriate decision for 
its population. Such reliance-based regula- 
tory decision-making is now a 21st-century 
“best regulatory practice,” enshrined in
WHO guidelines (3). 

Unfortunately, FDA entered the current 
pandemic ill equipped to serve in the role 
of reference agency to LMICs because of 
its decades-old confidentiality practices, 
dating back to a world in which most US 
pharmaceutical product development, 
manufacturing, and sales were of US or 
European manufactured products. In that 
era, other regulators’ actions had negli- 
gible impact on product development, au- 
thorization, or access in the United States. 
Yet FDA’s strict prohibitions on data shar- 
ing have remained in place even as the 
global pharmaceutical ecosystem has been 
transformed. Today, products may be man- 
ufactured in part or in whole in various 
countries and shipped widely before dis- 
tribution. During that journey, the prod- 
uct will often be under the authority of a 
variety of agencies of varying capabilities. 
Sharing of critical information garnered 
along this global manufacturing and tran- 
sit train is needed for effective oversight of 
the global supply chain of all products, in- 
cluding those bound for the United States. 


Yet until 1993, FDA regulations did not 
even distinguish between release of infor- 
mation to the general public and docu- 
ment sharing with other regulators (4); as 
a result, no information considered con- 
fidential was shared. Then in 1993, FDA 
took a step toward greater international 
cooperation. The agency adopted a regula- 
tion permitting the agency to share some 
nonpublic information with foreign regu- 
lators, provided that the foreign country 
makes certain commitments to protect the 
information from disclosure (5). In practice, 
however, FDA created an extensive process 
to validate the ability of countries to main- 
tain confidentiality. 

Nearly 30 years later, the agency has 
established confidentiality arrangements 
with only a few counterpart agencies (6), fo- 
cusing on a small number of countries per- 
ceived to be able to provide the most helpful 
information in return—not on agencies in 
LMICs. Moreover, in all but one of these ar- 
rangements, FDA does not share data classi- 
fied as “trade secret,” a term that the agency
has applied more broadly than necessary 
to include large amounts of manufacturing 
and biopharmaceutical data (7). 

The one agreement permitting the shar- 
ing of trade secret information, with the 
European Medicines Agency (EMA), took 
25 years and several legislative interven- 
tions to operationalize (8, 9). It was made 
possible by 2012 legislation that permitted 
the sharing of unredacted reports after 
FDA verified the “demonstrated ability” of 
foreign regulators to protect trade secret in- 
formation (10). FDA extended this approach
to the United Kingdom’s regulatory agency 
after it left the European Union because it 
was part of the EMA agreement. FDA has 
made no other such agreements regarding 
full inspection reports. 

Separately, FDA has established a limited 
data sharing relationship with WHO, which 
exemplifies the challenges of current FDA 
practices with LMICs. FDA sends certain re- 
dacted documents to WHO, which then, us- 
ing its own scarce resources, further distills 
information into a WHO document, which 
WHO can share (11). But the heavily redacted
documents FDA can share make it difficult 
or impossible for resource-constrained agen- 
cies to rely on FDA regulatory activities when 
making their own regulatory decisions. 

During the current pandemic, FDA 
has issued emergency use authorizations 
(EUAs) for multiple products. The only 
public documents related to these actions
are the authorization letters to the com- 
panies and the “fact sheets” for health 
care providers and for patients and care- 
givers. The full (unredacted) assessment 
and inspection reports, were they avail- 
able, would have been enormously valu- 
able to agencies in resource-constrained 
countries. These agencies are highly in- 
terested in what the FDA EUA decision 
means, what data support it, what dataset 
was submitted to FDA, where and how the 
product was manufactured, and why FDA 
made the decision it made. In some cases, 
FDA may have been the first or only global 
regulator inspecting a clinical trials site 
or manufacturing facility or authorizing a 
product. Without this critical information, 
agencies cannot even assure that ship- 
ments they receive are the same version 
as those authorized by FDA (and not, for 
example, from a manufacturing facility not 
assessed by FDA). 

In contrast to FDA, WHO and the 
European agencies initiated special pro- 
cedures to include assessors from several 
countries, including LMICs, in their delib- 
erations before product authorization (12). 
Such reliance-based pathways reduced the 
need for every country to conduct time- 
consuming, redundant, and often poorly 
informed assessments and inspections. It
also created an extra benefit of faster access
to markets in LMICs for companies seeking 
authorization in Europe, compared with the 
United States. 

The success of this WHO and Europe- 
led system of critical information sharing 
is exemplified by the rapidity of African 
COVID-19 vaccine authorizations. As of 
March 2022, in the 54 African countries, 
there were 277 initial authorizations of 
COVID-19 vaccines, which had received 
emergency use listing by WHO. In 52 coun- 
tries, multiple such vaccines have been 
authorized (13). These COVID-19 vaccine 
authorizations were facilitated by using a 
reliance-based pathway endorsed by the 
African Union and WHO (14). The vast
majority are based on EMA’s authorization 
decisions because EMA and WHO worked 
closely together on the products’ assess- 
ments and because both EMA and WHO 
shared—in real time—helpful critical sci- 
entific assessment and inspections docu- 
ments with the African agencies. 

Meanwhile, because of its confidential- 
ity practices, FDA was only minimally in- 
volved in helpful information sharing with 
resource-constrained agencies. Had FDA 
data and analysis been more widely avail- 
able, many countries could have used them 
to make rapid, informed decisions for 
their populations. 


Three steps FDA should take for greater global data-sharing 


1. Share copies of assessment and inspections reports for all pandemic-related products. 
As a short-term measure during this declared state of emergency, FDA should waive its
current practices so that it can share, upon request, with WHO and all other counterpart
agencies, whether products have been authorized or not. For the many countries without a
confidentiality agreement, FDA should move quickly to accept their statements of authority
and commitment for confidentiality. The agency should limit redactions to personally
identifiable information and trade secrets as narrowly defined in the statute, which would 
permit the sharing of much helpful biopharmaceutical and manufacturing data. 


This step can still help with the COVID-19 pandemic response. There are new products and 
new versions of older products to be assessed; there are pressing questions concerning 
childhood vaccination, boosters, and variants to be answered; there are changes in manu- 
facturing procedures to be validated to enable greater and more cost-effective product 
production; and there are new efficacy and safety data to be analyzed. 


2. Develop new practices on data-sharing with WHO and counterpart agencies outside public 
health emergencies. This should include a rapid path to accepting assurances from foreign 
counterparts regarding commercial confidential and trade secret information. These new 
practices should facilitate reliance-based pathways by counterpart agencies and recognize 
that these agencies should not be considered part of the “general public” for purposes 
of confidentiality. FDA should also consider additional reforms to make sharing of fully 
unredacted reports, especially inspection reports, less onerous. 


3. Publish annual reports of its data-sharing activities with counterpart regulators. This should 
include both numbers of requests received and from whom and numbers of requests fulfilled 
and with whom. These reports should document the consequences both for global health 
and for oversight of products headed to the US market. They should also assess emerging 
challenges to data sharing, along with potential solutions. 
Successful US leadership in the modern
global pharmaceutical ecosystem, as well 
as global health and safety, depends on the 
ability of regulators to trust each other and 
work together. Greater FDA global data- 
sharing engagement starts with three steps 
(see the box). Progress will require more 
resources, capacity building, and will. In 
the 21st century, sharing data and critical 
documents among regulators cannot be an 
afterthought: It must be “business as usual,” 
including for FDA. 


REFERENCES AND NOTES 


1. WHO, “List of National Regulatory Authorities (NRAs)
operating at maturity level 3 (ML3) and maturity level 4
(ML4)” (WHO, 2022); https://www.who.int/initiatives/
who-listed-authority-reg-authorities/maturity-level.

2. US National Academies of Sciences, Engineering, and
Medicine, “Mutual recognition agreements in the
regulation of medicines” (2019);
https://www.nationalacademies.org/our-work/
mutual-recognition-agreements-in-the-regulation-of-medicines.

3. WHO, “TRS 1033 - 55th report of the WHO Expert
Committee on Specifications for Pharmaceutical
Preparations” (WHO, 2021); https://www.who.int/
publications/i/item/55th-report-of-the-who-expert-
committee-on-specifications-for-pharmaceutical-preparations.

4. Public information; Communications with Foreign
Government Officials. Fed. Regist. 58 (223), 61598 ff
(19 November 1993).

5. 21 Code of Federal Regulations 20.89.

6. FDA, Confidentiality commitments;
https://www.fda.gov/international-programs/
international-arrangements/confidentiality-commitments.

7. Trade secret is defined as “any commercially valuable
plan, formula [that has] a direct relationship to the
production process,” with examples including “the type
or brand of equipment used in manufacturing, product
formulas, product components or ingredients not on the
label, specifications that are unique (i.e., not in the US
Pharmacopeia), and technical designs” (8).

8. FDA, FDA Staff Manual Guide (SMG 2830.3).

9. FDA, Mutual Recognition Agreement (MRA);
https://www.fda.gov/international-programs/
international-arrangements/mutual-recognition-agreement-mra.

10. The Food and Drug Administration Safety and
Innovation Act of 2012, Pub. L. No. 112-144.

11. FDA, “FDA In Brief: FDA announces pilot program with
World Health Organization to expedite review of HIV
drug applications” (FDA, 2018); https://www.fda.gov/
news-events/fda-brief/fda-brief-fda-announces-pilot-
program-world-health-organization-expedite-review-hiv-drugs.

12. EMA, Questions and answers on the pilot project
‘OPEN’ Opening our Procedures at EMA to Non-EU
authorities (EMA, 2021); https://www.ema.europa.eu/en/
documents/other/questions-answers-pilot-project-open_en.pdf.

13. WHO, AVAREF COVID-19 Africa vaccine dashboard;
https://www.afro.who.int/health-topics/immunization/
avaref/covid-19-africa-vaccine-dashboard.

14. Africa Centers for Disease Control and Prevention
(Africa CDC), “Guidance on emergency expedited
regulatory authorisation and access to COVID-19 vaccines
in Africa” (Africa CDC, 2021); https://africacdc.org/
download/guidance-on-emergency-expedited-
regulatory-authorisation-and-access-to-covid-19-vaccines-in-africa.


ACKNOWLEDGMENTS 


M.M.L. is a former deputy commissioner for international and
special programs of FDA. M.A.H. is a former commissioner of
FDA. W.B.S. is a former deputy commissioner for policy of FDA
and former general counsel of the US Department of Health
and Human Services. J.M.S. is a former principal deputy
commissioner of FDA.


10.1126/science.abq4981 


science.org SCIENCE 


PHOTO: ANGELINA M. BILATE 


IMMUNOLOGY 


Twice the tolerance 


A gut microbiota-derived antigen elicits distinct subsets 
of regulatory T cells to suppress inflammation in mice 


By Meera K. Shenoy and Meghan A. Koch


Mammalian survival depends on
the ability of the immune system
to mount sterilizing protection
against pathogens while regulating
(suppressing) responses toward
innocuous antigens such as those from
food or commensal microbes. Within the
intestine, two key helper T cell populations,
CD4+ peripheral regulatory T cells (pTreg
cells) and CD4+CD8αα+ intraepithelial lym-
phocytes (CD4IELs), are largely responsible
for restraining aberrant inflammatory re-
sponses against self and innocuous foreign
antigens (1). Although progress has been
made in understanding the mechanisms re-
sponsible for gut pTreg cell development and
function, far less is known about the speci-
ficity, differentiation, and role of CD4IELs.
On page 660 of this issue, Bousbaine et al.
(2) reveal that β-hexosaminidase (β-hex),
a conserved antigen expressed by several
gut commensal bacterial species, drives
CD4IEL differentiation. These β-hex-elic-
ited CD4IELs cooperate with pTreg cells to
limit pathology in a mouse model of colitis.
IELs are a tissue-resident immune cell 
population that resides at the basolateral 
surface of the intestinal epithelial cell bar- 
rier and helps to maintain tissue homeosta- 
sis within the intestines. Although there are 
many types of IELs, CD4IELs are specifi-
cally “induced” from conventional CD4+ T
cells in the periphery after antigen exposure
and tissue cues in gut-associated second-
ary lymphoid organs. These cells then up-
regulate CD8αα expression and traffic to
the intestinal epithelial barrier, where they
can exert both regulatory and cytotoxic
functions to reinforce gut health (3, 4).

Recent studies aimed at defining the an-
tigens and tissue signals that drive CD4IEL
differentiation have revealed a funda-
mental role for the gut microbiota. Germ-
free mice almost entirely lack CD4IELs;
however, upon microbial reconstitution,
CD4IELs differentiate and resemble those of
conventional mice in number and phenotype
(4). Additionally, conventional mice treated
with broad-spectrum antibiotics experi- 
ence a precipitous drop in CD4IELs, which 
rebound as the microbiota reestablishes 
itself after antibiotic cessation (5). Thus, 
the microbiota is required for both the dif- 
ferentiation and maintenance of CD4IELs, 
yet the mechanisms underlying this process 
remain unclear. 

Lactobacillus reuteri, a common mem- 
ber of the mouse intestinal microbiota, 
metabolizes tryptophan to produce indole 
derivatives. Binding of these derivatives 
to the aryl hydrocarbon receptor (AHR) 
on CD4+ T cells was sufficient to down-
regulate expression of the transcription
factor Thpok, thus driving CD4IEL dif-
ferentiation. Other studies correlated the
abundance of specific commensal species
such as L. reuteri or Faecalibacterium
prausnitzii with CD4IEL counts, yet it was
unknown whether commensal-derived an-
tigens could directly trigger CD4IEL differ-
entiation or function (5, 6).

Regulatory T cells induced by β-hexosaminidase from
Bacteroidetes species in the mouse gut microbiota
can suppress colitis, as shown in this hematoxylin and
eosin-stained section of a mouse colon.

Bousbaine et al. identified β-hex derived
from the commensal Parabacteroides gold-
steinii as an antigen that is recognized by
CD4+ T cells and triggers their differentia-
tion into CD4IELs and pTreg cells in mice.
They found that β-hex-specific CD4+ pTreg
cells and CD4IELs can be readily found in
mice colonized with a complex microbiota,
and the P. goldsteinii β-hex epitope is evolu-
tionarily conserved across several species of
the Bacteroidetes phylum. Additionally, β-
hex from these species stimulated increased
frequencies of β-hex-specific CD4IELs,
whereas β-hex from species with different
epitopes failed to do the same. Thus, the
findings of Bousbaine et al. indicate that 
conserved bacterial epitopes derived from 
abundant, yet distinct, commensal bacteria 
support the expansion of CD4IELs. 

How do β-hex-elicited CD4IELs par-
ticipate in intestinal homeostasis?
Transcriptional analyses of mouse IELs
and correlative studies of CD4IELs from
patients with HIV or inflammatory bowel
disease (IBD) suggested that CD4IELs have
a regulatory role in maintaining tissue ho-
meostasis and limiting pathology (4, 7-9).
Using gene expression analysis, the authors
confirmed that β-hex-specific CD4IELs
were transcriptionally similar to the total
CD4IEL pool. Using an adoptive trans-
fer model of colitis in which naive CD4+
T cells trigger severe intestinal inflam-
mation when transferred into immune-
deficient recipient animals, Bousbaine
et al. demonstrated that the addition of
commensal-specific CD4+ T cells (which
differentiate into pTreg cell and CD4IEL
populations) could fully rescue colitis pa-
thology and morbidity. Moreover, when
mice were given CD4+ T cells that could
only differentiate into CD4IELs, half were 
protected from colitis. These key findings 
demonstrate that CD4IELs elicited by a 
conserved commensal antigen can restrain 
harmful inflammatory responses in the gut 
(see the figure). 

Ultimately, the study of Bousbaine ez al. 
addresses an important outstanding ques- 
tion in mucosal immunology regarding how 
the microbiota contributes to the develop- 
ment and function of CD4IELs. One of the 
most fascinating findings is that a single 
commensal antigen can drive the differ- 
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A bacterial antigen promotes intestinal homeostasis in mice 
Aconserved epitope in the Bacteroidetes enzyme B-hexosaminidase (8-hex) is presented by antigen-presenting 
cells (APCs) to CD4* cells to drive their differentiation to CD4*CD8aa‘* intraepithelial lymphocytes (CD4IELs) in 
the intestinal tissue and forkhead box P3* (FOXP3*) peripheral regulatory T cells (pT,,, Cells) in gut-associated 
lymphoid organs. These two T,,, Cell populations suppress intestinal inflammation and promote homeostasis. 


CDAIEL Inflammatory 


lymphocyte 


FOXP3* 
PTreg Cell 


entiation of two distinct types of oe cell: 
CD8aa* CD4IELs in the intestinal tissue 
and forkhead box P3* (FOXP3*) pT,,. cells 
in the gut-associated lymphoid organs. 
This finding suggests that there may be 
numerous factors, both antigen-dependent 
and -independent, that determine how 
Te cells differentiate and how they be- 
have in situ. How f-hex elicits two differ- 
ent regulatory cell populations remains to 
be revealed. Perhaps the context of antigen 
presentation (such as location or antigen- 
presenting cell) dictates ®-hex-specific T 
cell fate. Intriguingly, the microbial antigen 
in question is conserved across numerous 
bacterial species. It will be interesting to 
dissect whether induction of Ts cells con- 
fers an advantage to the microbial species 
that express B-hex. Another open question 
concerns the specificity of other mouse 
CD4IELs, as well as CD4IELs in humans: 
Do they also recognize commensal-derived 
ligands? Relatedly, if other CD4IELs are in- 
deed specific for the microbiota, it is pos- 
sible that they may have “sister” pT... cells 
with the same specificity. 

This work also provides new insight re- 
garding the therapeutic potential of the 
microbiota. For example, treatment with 


576 5 AUGUST 2022 + VOL 377 ISSUE 6606 


Mesenteric lymph node 


reactive 
Treg Cells 


“ie 


B-hex or other commensal antigen(s) rec- 
ognized by CD4IELs may mitigate intesti- 
nal inflammation in affected individuals. 
Notably, only half of the animals were pro- 
tected from colitis when antigen-specific 
CD4* T cells were precluded from becom- 
ing FOXP3* PT... cells and forced toward 
CD8aa* CD4IEL differentiation, highlight- 
ing the complexity of immune regulation in 
the intestine. The discoveries and new tools 
reported by Bousbaine et al. lay the ground- 
work for testing whether other commensal 
antigens can elicit CD4IELs, delineating the 
pathways involved in their differentiation, 
and manipulating this process to promote 
intestinal homeostasis.

TIME CRYSTAL


Unleashing spontaneity in a time crystal


Ordered patterns reoccur over time in an ultracold atomic gas trapped in light


By Lindsay J. LeBlanc 


The quest to understand how systems
develop order underlies the funda-
mental understanding of almost all
sciences, from the origin of galaxies
to the origin of life. Over the past de-
cade, ideas about temporal crystal-
lization have been explored theoretically
and experimentally, probing the question of 
whether systems can develop regular, repeat- 
ing behavior through time, in the same way 
that spatial crystals regularly cycle through 
different configurations. These so-called 
“time crystal” behaviors have been observed 
before (1-6), with their periodic properties 
usually driven by periodic external forces. On 
page 670 of this issue, Kongkhambut et al. 
(7) demonstrate a time crystal that emerges 
from a time-independent initial state without 
the application of a recurring external force, 
analogous to the formation of spatial crystals. 
The study provides insight into the nature of 
dynamical phases in quantum systems and 
how order can manifest in the presence of 
dissipative processes. 

To help understand the abstract concept 
of temporal order, it is necessary to first es- 
tablish what “order” means in space. One 
important concept to consider when it comes 
to understanding spatial order is symmetry. 
For example, a liquid has perfect continuous 
translational symmetry because its proper- 
ties are identical at every point in space. 
However, once the liquid is turned into a 
crystalline solid—for example, from water to 
ice—this continuous translational symmetry 
is lost, and the properties of this solid are the 
same only when the location of the observer 
is shifted by multiples of the unit cell length 
of the crystal. There is no predetermined lo- 
cation at which a specific atom should find it- 
self upon crystallization; an anchoring lattice 
site is spontaneously and randomly selected 
as the crystal freezes in place. This sponta-
neity in selecting a starting point for the
emergent order is a defining characteristic in 
phase transitions. 
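
As an illustration only (this sketch is not part of the article or of the work it describes), the difference between continuous and discrete translational symmetry can be checked numerically. A uniform density stands in for the liquid and a cosine-modulated density stands in for the crystal; the function names, the unit cell length `a`, and the `origin` parameter (the arbitrary lattice anchor) are assumptions chosen for this example.

```python
import math


def liquid_density(x):
    """Uniform density: identical at every point (continuous symmetry)."""
    return 1.0


def crystal_density(x, a=1.0, origin=0.0):
    """Periodic density with unit cell length a.

    origin is the arbitrary anchoring point of the lattice (hypothetical
    parameter for this sketch).
    """
    return 1.0 + 0.5 * math.cos(2.0 * math.pi * (x - origin) / a)


def is_symmetric_under_shift(density, shift, samples=200, span=5.0, tol=1e-9):
    """Return True if density(x + shift) matches density(x) at sampled points."""
    for i in range(samples):
        x = span * i / samples
        if abs(density(x + shift) - density(x)) > tol:
            return False
    return True
```

In this toy picture the liquid passes the check for any shift, whereas the crystal passes only for shifts that are whole multiples of `a`. Changing `origin` moves the lattice without changing that conclusion, which mirrors the spontaneous choice of an anchoring site.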

In the spirit of understanding systems 
through their symmetries, physicists have 
wondered whether the same concept exists 
in time (8, 9)—whether there could exist 
“time crystals,” physical systems with stable,
periodic behaviors that break time-transla- 
tion symmetry. This notion is at odds with 
conventional quantum mechanics, which 
presumes that the natural state of a system 
does not depend on time. Although studies 
have shown that time crystals are not pos- 
sible in most “natural” situations, they are 
possible for systems that are driven and/or 
dissipative—when energy is added to or lost 
by the system. These “time crystallization” 
behaviors have been demonstrated, for ex-
ample, in trapped ions (1), nitrogen vacancy
centers (2), nuclear magnetic resonance (3), 
superfluid helium (4), cold atoms (5), and su- 
perconducting qubits (6). All of these experi- 
ments were performed in periodically driven 
systems, in which the periodicity emerged at 
twice the period of the driving force, chang- 
ing the original symmetry to a different sym- 
metry. Although these pioneering results 
lend insight into emergent behavior in driven 
systems, the timing of the recurrent behavior 
is a simple multiple of the external driver's 
timing. Because the temporal pattern relies 
on an external driver, this raises the question 
of whether the nomenclature “time crystal- 
lization” is really the best way to describe this 
phenomenon. 
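
The period doubling described for driven systems can be pictured with a deliberately simple toy model, offered here as an assumption-laden sketch rather than a description of any of the cited experiments: an observable is sampled once per drive period and flips sign each time, so it repeats only after two drive periods. The function names are hypothetical.

```python
def stroboscopic_response(n_periods, initial=1):
    """Toy model: sample an observable once per drive period.

    Each drive period flips the sign of the observable (an idealized
    assumption made only for this illustration).
    """
    values = [initial]
    for _ in range(n_periods):
        values.append(-values[-1])
    return values


def smallest_period(values):
    """Smallest p > 0 with values[i + p] == values[i] for every valid i."""
    for p in range(1, len(values)):
        if all(values[i + p] == values[i] for i in range(len(values) - p)):
            return p
    return None
```

Calling `smallest_period(stroboscopic_response(8))` evaluates to 2, meaning the response repeats every two drive periods. The rhythm is still set by the external drive, which is the limitation the paragraph above points out.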

Kongkhambut et al. demonstrate a kind of 
time crystallization that breaks continuous 
time translation symmetry without the need 
of a recurring external force. Using a Bose- 
Einstein condensate of ultracold rubidium 
atoms inside an optical cavity, the resonance 
frequency of the cavity was engineered to be 
about the same as that of the atoms, while 
a finely tuned standing-wave laser was used 
as a pump to interact with the atoms (see 
the figure). Through a phenomenon known 
as Dicke superradiance, the atoms absorb 
the photons from the laser and then reemit 
the light into the cavity (10, 11). In turn, the
light emitted into the cavity is reabsorbed 
by the atoms again and mediates an effec- 
tive interaction across all atoms in the cavity, 
repeatedly. This phenomenon prompts the 
emergence of a spatial order among the at- 
oms, breaking the continuous spatial transla- 
tion symmetry of the atoms—a Bose-Einstein 
condensate—into a two-dimensional periodic 
lattice and creating a spatial crystal. 

Here, Kongkhambut et al. make a small
but crucial change to the setup: they add
“blue-detuned” light (12), whose energy
is higher relative to that of the atoms. This
change makes two structural configurations
possible, which compete
to lower the overall energy of the system.
Bathing in the blue light, the atoms prefer to 
remain in the low-intensity regions of the la- 
ser field. This reduces the probability of them 
emitting light into the cavity, which then 
lowers the strength of the standing-wave po- 
tential that keeps them in the first configu- 
ration. Free from this trap, the atoms would 
then move into regions where they are more 
likely to emit light into the cavity in the sec- 
ond configuration. However, in this location, 
the atoms begin to interact more strongly 
with the pump light, which reestablishes the 
standing-wave lattice and shifts atoms back 
to their original positions, and a recurring 
cycle is born. 


The final ingredient that makes this a time crystal is a short delay between the atoms’ shift in position and the light being emitted to shift them back, which determines the rhythm of the process. A key feature in this system that allows this behavior to be observed is that a small portion of the light leaks out of the cavity during each cycle and varies in amplitude with the same time dependence as that of the atom positions in the cavity. This makes possible the direct measurement of the spontaneity of the symmetry breaking, revealed as the randomly distributed phases of oscillation measured over repeated instances of this experiment. Together with an additional check of the rigidity of this phase in the presence of noise, Kongkhambut et al. lay claim to the discovery of a continuous time crystal, one that spontaneously begins its recurring behavior at any point in time, in contrast to discrete time crystals that begin oscillating only at specific, predetermined times.

As a first demonstration, the set of conditions over which this phenomenon arises is narrow because of the delicate balance between energy scales. And like the periodically driven systems, there is a native time scale built into this system in the round-trip timing of light reflecting between the mirrors of the cavity. Moving forward, the limits of these systems are sure to be pushed to better understand the nature of dynamical phases, such as exploring the transitions between normal and crystalline states (13), and in adding experimental probes such as quantum gas microscopy to gain insight into the correspondence between spatial and temporal order. The study of dynamical many-body quantum systems remains an important area of investigation whose outcomes may range from the practical applications of time crystallization in precision metrology to answering questions about the fundamental nature of time in quantum mechanics.


A shape-shifting lattice of atoms
Atoms inside a cavity choose their locations according to the lowest energy configuration they can find. Kongkhambut et al. designed conditions in which two competing configurations are equally preferred and the system spontaneously begins oscillations between them, creating a “time crystal.”

Light from a “pump” laser interacts with the atoms, which absorb and reemit some of the light in the direction of the other light beam.

Light in the cavity mode interacts with the atoms while providing a background landscape of hills and valleys where the atoms can nest.

In regions of high intensity (the “hills”), the atoms emit more light into the cavity mode than they absorb, making the configuration energetically unfavorable over time.

After moving to the “valleys” with low laser intensity, emission from the atoms into the cavity is reduced, and the atoms eventually return to the hills where the interaction energy is lower.


REGENERATION


A regenerative niche for stem cells 


Production of hyaluronic acid allows regenerative signaling in muscle stem cells after injury 


By Davide Gabellini 


Skeletal muscle is the largest tissue in the human body and has a substantial capacity to recover its structure and function after acute damage. This regenerative ability is due to muscle stem cells (MuSCs), which are normally in a dormant state called quiescence (1). After injury, signals from dying muscle fibers and various tissue-resident and infiltrating cell types, including inflammatory cells, lead to MuSC activation and promotion of their muscle regeneration activity. Although integration of the different signals present in the injury microenvironment (niche) is key to muscle repair and reconstitution of the quiescent stem cell pool (2), how MuSCs adapt to the altered niche is not completely known. On page 666 of this issue, Nakka et al. (3) show that MuSCs activate the production of the extracellular matrix component hyaluronic acid (HA), which is required to overcome inhibitory inflammation signaling from the injury niche, exit quiescence, and initiate muscle repair.

Cell fate transitions induced by muscle damage are associated with important changes in the epigenetic landscape of MuSCs (4). Nakka et al. investigated the role of Jumonji domain-containing protein 3 (JMJD3) and ubiquitously transcribed TPR protein on the X chromosome (UTX), two enzymes responsible for removal of the inhibitory modification trimethylated histone H3 lysine 27. By generating mice conditionally ablated for either enzyme in MuSCs, they found that the two proteins have distinct roles in muscle regeneration. JMJD3 is mainly involved in the early steps of MuSC activation, whereas UTX plays an important role in subsequent MuSC proliferation and differentiation. Among the specific JMJD3 targets, the authors identified that the gene hyaluronan synthase 2 (Has2), which encodes a key enzyme in HA biosynthesis, is important for MuSC activation. Accordingly, JMJD3 ablation in MuSCs in mice results in a reduction of HA deposition in the injured niche. Notably, MuSC activation and muscle regeneration defects observed upon JMJD3 MuSC-specific deletion can be phenocopied by treatment of wild-type mice with the HA synthesis inhibitor 4-methylumbelliferone. Also, treatment with HA can partially rescue defects from JMJD3 MuSC-specific deletion. Collectively, these data indicate that the JMJD3-HAS2-HA pathway plays a key role in muscle repair (see the figure). In the future it will be important to determine the upstream signaling pathway(s) that are responsible for JMJD3 activation upon muscle injury. Also, understanding whether other stem cells use the same molecular switch to adapt to inflammatory conditions and allow tissue repair could help identify potential therapeutic targets.

HA is not only a major component of the extracellular matrix, which maintains tissue structure and rigidity, but is also important for regulating cell-cell interactions and modulating the local concentration of soluble factors in the microenvironment (5). This is particularly relevant in the context of muscle regeneration, in which MuSCs and various inflammatory cells reciprocally coordinate the repair process (6). Upon muscle injury, MuSCs stimulate tissue-resident macrophages to attract circulating inflammatory cells. These cells are initially needed to remove dead cells and make room for new muscle (6). They also produce a number of cytokines, including interferon-γ (IFN-γ) and interleukin-6 (IL-6), which exert muscle proregenerative effects by stimulating MuSC proliferation (7, 8). Notably, Nakka et al. demonstrated that IFN-γ and IL-6 hinder MuSC activation and that HA is needed to overcome this inhibition, allowing MuSCs to be activated and contribute to muscle regeneration.

Nakka et al. also showed that HA is not intrinsically required for MuSCs to exit quiescence. HA is needed only when MuSCs are exposed to an inflammatory microenvironment. Indeed, MuSCs that are unable to produce HA display an apparently intact activation in response to injury in the absence of inflammation. This raises the exciting possibility, deserving of future investigation, that HA is not needed just to shield MuSCs from the negative influence of inflammatory cytokines but that HA and its receptors could be signals in the injury niche that promote regeneration.

Later during the muscle regeneration process, inflammatory cells modify their gene expression and metabolic programs and stimulate MuSC fusion and differentiation to build new muscle tissue. These processes are required for resolution of inflammation, tissue restoration, and the return to homeostasis (6). It would be interesting to determine whether HA produced by MuSCs contributes to integrating signals and regulating cell-cell interactions during this phase of muscle regeneration.

Previous in vitro studies found that, while promoting muscle cell proliferation, HA inhibits their differentiation (9). Nakka et al. showed that HA levels are very low in the quiescent MuSC niche and are dynamically regulated during regeneration, peaking at the time of MuSC activation. Besides synthesis by hyaluronan synthases, HA levels are controlled by degradation, which is carried out by several hyaluronidases that are expressed in a tissue-specific manner (10). Whether HA degradation contributes to MuSC return to quiescence upon muscle damage repair is a relevant aspect to be investigated in the future.

Aberrant HA accumulation has been reported in cellular models of facioscapulohumeral muscular dystrophy (FSHD), and preventing HA buildup reduces pathological symptoms (11). Hence, it would be interesting to investigate whether increased or persistent HA production contributes to FSHD by interfering with MuSC homeostasis and repair. Another condition associated with altered HA levels is aging, in which HA in skin connective tissue and in synovial joint fluid is reduced. For this reason, HA creams and injections are used to reduce wrinkles or for arthritis treatments. Whether an altered HA production by MuSCs contributes to the abnormal deposition of extracellular matrix components and the reduced muscle tissue regenerative potential in aging (12) remains to be investigated. Overall, the discovery of the JMJD3-HAS2-HA pathway and its central role in MuSC activation upon muscle injury by Nakka et al. opens many new avenues for both understanding damage responses and developing therapies for muscle diseases.


Hyaluronic acid protects muscle stem cells
In the absence of muscle damage, muscle stem cells (MuSCs) are in a dormant, quiescent state. After muscle injury, MuSCs up-regulate the expression of Jumonji domain-containing protein 3 (JMJD3), which derepresses the hyaluronan synthase 2 (Has2) promoter; the resulting HAS2 protein increases hyaluronic acid production. This extracellular matrix component protects MuSCs from inhibition by inflammatory cytokines secreted by macrophages, allowing MuSC proliferation and tissue repair.


METABOLISM


Mitochondria rescue cells 
from ischemic injury 


Activation of a G protein-coupled receptor prevents 
cardiomyocyte death during ischemia 


By Susana Cadenas


Ischemic diseases, such as myocardial
infarction and stroke, are major causes 
of death worldwide. To date, therapies 
aimed at preventing ischemic damage or 
restoring functional tissue have not been 
successful. Kynurenic acid (KynA), a me- 
tabolite of tryptophan metabolism, is tissue 
protective in cardiac, cerebral, renal, and reti- 
nal models of ischemia, but the mechanism 
of this protection was unknown. On page 621 
of this issue, Wyant et al. (1) report that KynA
binds G protein-coupled receptor 35 (GPR35) 
to activate downstream signaling and, once 
internalized, indirectly interacts with adeno- 
sine triphosphate (ATP) synthase inhibitory 
factor 1 (IF1) to block ATP depletion during 
ischemia by promoting the dimerization and 
inactivation of ATP synthase. This cardiopro- 
tective mechanism could be further explored 
for preventing or treating ischemic injury. 

GPR35 expression is up-regulated in the 
myocardium in human heart failure (2), and 
its expression increases during hypoxia un- 
der the control of hypoxia-inducible factor 1 
(HIF-1) (3), which directs multiple transcrip- 
tional programs in response to reduced oxy- 
gen availability. Studies investigating the tis- 
sue-protective role of GPR35 during ischemia 
have yielded conflicting results. Using a vari- 
ety of approaches, Wyant et al. provide com- 
pelling evidence that, by activating GPR35, 
KynA is cardioprotective against ischemia- 
reperfusion (IR) injury. They show that KynA 
is unable to prevent IR injury in mice lacking 
GPR35 and in human induced pluripotent 
stem cell-derived cardiomyocytes expressing 
a mutant version of the receptor. The cardio- 
protective effect of KynA is mimicked by sev- 
eral GPR35 agonists in mice. 

HIF transcription factors are heterodi-
mers consisting of an oxygen-labile α sub-
unit that is stabilized under hypoxia and a
constitutively expressed β subunit. Oxygen-
dependent hydroxylation of HIFα by HIF
prolyl hydroxylases (PHDs), which are
α-ketoglutarate (αKG)-dependent dioxy-
genases, promotes its ubiquitination and
proteasomal degradation (4). HIF activation 
improves ischemia-related outcomes through 
several mechanisms, the main one being the 
restoration of oxygen homeostasis mediated 
by reprogramming metabolism and inducing 
angiogenesis (blood vessel formation). HIF is 
also important in ischemic preconditioning 
(5), a phenomenon whereby brief periods of 
nonlethal ischemia protect against a subse- 
quent prolonged ischemic insult; inhibiting 
the HIF pathway impairs different types of 
ischemic preconditioning (6, 7). 

A clinically practical approach to improv- 
ing the outcomes of ischemic disease is re- 
mote ischemic preconditioning (RIPC), in 
which transient ischemia to a nonvital organ 
protects distant organs. The molecular me- 
diators of RIPC have been sought for many 
years in the hope that they might be useful 
to treat cardiovascular disease. The study of 
Wyant et al. expands on the authors’ previous 
work showing that KynA affords cardiopro- 
tection in a mouse model of RIPC (8). They 
showed that inhibition of PHD2 leads to 
HIF activation, which protects mice against 
cardiac IR injury. Systemic PHD2 loss or 
specifically that in skeletal muscle increases 
circulating αKG, driving hepatic production
and secretion of KynA, which is necessary 
and sufficient to mediate cardiac ischemic 
protection. 

More than 95% of the ATP in cardiomyo- 
cytes is produced by the inner mitochondrial 
membrane ATP synthase during oxidative 
phosphorylation. In normally respiring mito- 
chondria, ATP synthase produces ATP from 
adenosine diphosphate (ADP) and phos-
phate using the proton motive force (PMF)
generated by electron transfer along the re- 
spiratory chain. During ischemia, the PMF 
is dissipated, and ATP synthase reverses di- 
rection and hydrolyzes ATP, contributing to 
cell death (9). This hydrolytic activity is lim-
ited by IF1, thus conserving cellular energy.
Wyant et al. propose that, once activated, 
GPR35 translocates to the outer mitochon- 
drial membrane, where it indirectly interacts 
with IF1. Internalized GPR35 inhibits mito- 
chondrial adenylate cyclase [and therefore 
cyclic adenosine monophosphate (cAMP) 
production] and consequently protein kinase
A (PKA), which favors dephosphorylated IF1,
the active form of the inhibitor (10). They
show that IF1 promotes ATP synthase di- 
merization and prevents ATP loss during is- 
chemia (see the figure). 

The binding of IF1 to ATP synthase also 
affects the mitochondrial permeability tran- 
sition pore (mPTP), a high-conductance non- 
specific channel that is formed in the inner 
mitochondrial membrane under oxidative 
stress, leading to depolarization and ces- 
sation of ATP synthesis, with catastrophic 
consequences for energy conservation. The 
mPTP contains ATP synthase dimers (11), and IF1 binding may prevent mPTP opening in the low-pH conditions of ischemia (12). Indeed, low pH favors IF1 binding to the enzyme (13). Additionally, IF1 is up-regulated in hypertrophied hearts, promoting the formation of inactive ATP synthase tetramers, which triggers mitochondrial reactive oxygen species (ROS) production, stabilizing HIF-1α and activating a metabolic switch that stimulates glycolysis (14). It is possible that the inhibition of ATP synthase by IF1 during ischemia also initiates ROS signaling, which may affect cardiomyocyte metabolism.

The study of Wyant et al. supports further examination of KynA and, more broadly, GPR35 agonists, including synthetic compounds with greater potency than KynA, for preventing or treating ischemic injury. Their findings also provide a molecular explanation for the tissue-protective effects of kynurenine monooxygenase inhibitors, which promote accumulation of KynA and thus have cardioprotective potential. Similarly, preclinical studies with IF1 mimetics have generated promising results in models of cardiac ischemia (9). GPR35 activation by KynA stimulates lipid metabolism and thermogenic and anti-inflammatory gene expression, increasing energy expenditure in adipose tissue (15). Accordingly, can KynA induce a transcriptional program that affects cardiomyocyte metabolism? Elucidating the pathways orchestrating ischemic protection and the development of therapeutics that modulate them has important clinical implications for a patient population with historically poor outcomes.


Protecting against cardiac ischemia
Upon ischemia, hypoxia-inducible factor (HIF) induces expression of G protein-coupled receptor 35 (GPR35), which then binds kynurenic acid (KynA). GPR35 is internalized to the outer mitochondrial membrane, where it inhibits mitochondrial adenylate cyclase (sAC), which in turn prevents phosphorylation (P) of ATP synthase inhibitory factor 1 (IF1) by protein kinase A (PKA). Dephosphorylated IF1 leads to ATP synthase dimerization, preventing ATP hydrolysis and promoting energy conservation.

ADP, adenosine diphosphate; ATP, adenosine triphosphate; cAMP, cyclic adenosine monophosphate; FIH, factor inhibiting HIF;


CARBENES


Safe, selective, and scalable carbenes


The synthesis of reactive carbene intermediates is made simpler and safer


By Michael S. West 
and Sophie A. L. Rousseaux 


Across chemistry, new strategies to
access transient and reactive inter- 
mediates are of interest for further 
research and synthetic applications. 
In organic chemistry, carbenes are 
one such intermediate. These in- 
clude any neutral carbon molecule with 
two unshared valence electrons (1, 2).
Carbenes can participate in a variety of 
reactions such as dimerization, formation 
of three-membered rings, and bond inser- 
tions. However, accessing these valuable 
intermediates has traditionally required 
the use of precursor molecules that are 
unstable or are limited to specific types of 
reactions (3, 4). For example, nonstabilized 
alkyl-substituted carbenes are highly reac- 
tive intermediates, which limits their use 
in synthetic transformations (3). On page 
649 of this issue, Zhang et al. (5) report a 
means to access both stabilized and non- 
stabilized carbenes under safe and simple 
reaction conditions. 

A common strategy to generate carbenes 
is through the use of diazoalkanes, which 
are reactive carbon-based molecules that 
contain a diazo functional group (i.e., have 
two linked nitrogen atoms) that releases 
nitrogen upon carbene formation. These 
can undergo a variety of different reac- 
tions when paired with a metal catalyst (6). 
The diazoalkane starting materials can be 
substituted with both electron-accepting 
and electron-donating groups, which can 
modulate the electronic properties—and, 
subsequently, the reactivity profile—of the 
generated carbene. In all cases, the release 
of nitrogen during the reaction leads to 
important safety concerns when handling 
these thermally unstable molecules, be-
cause large volumes of colorless and odor-
less nitrogen can displace oxy-
gen and cause suffocation to
unsuspecting bystanders (7).
As a result, their use in indus- 
trial-scale reactions is severely 


characteristics) is the reaction 


gem-dihaloalkanes, which are 
carbon-based molecules that 
contain two halogen substitu- 
ents on the same carbon atom 
(9). Although this method is 
safer, the reactivity of the re- 
sulting zinc carbenoids is lim- 
ited to cyclopropane formation 
(i.e., cyclopropanation). The 
preparation of zinc carbenoids 
can also suffer from detrimen- 
tal side reactions if the starting material is 
an alkyl-substituted gem-dihaloalkane (10, 
11), which further limits their application. 

Owing to these limitations, there is a 
need for methods to make carbenes from 
stable precursors in a safe manner. Driven 
by their prior work in developing methods 
to access reactive intermediates (12, 13), 
Zhang et al. demonstrate the formation 
of aryl- and alkyl-substituted carbenes by 
starting with stable and scalable acetate- 
containing organohalide precursors, which 
themselves can be derived from simple al- 
dehydes (12). By pairing these carbenes 
with appropriate metal catalysts, their re- 
activity can be controlled for a variety of 
synthetic transformations (see the figure). 
This generalized approach avoids the out- 
standing safety concerns of diazoalkanes 
and the stability and selectivity limitations 
of gem-dihaloalkanes. 

Diazoalkanes can decompose near 
100°C, where the energy produced upon de- 
composition can be more than 200 kJ/mol 
(10). This thermal instability makes scale- 
up reactions with these reagents very dan- 
gerous. With aldehyde-derived carbene 
precursors, Zhang et al. observed stability 
at >300°C and that less than 30 kJ/mol of 
energy was produced upon decomposition; 
the output of nitrogen was also avoided. As 
a result, a cyclopropanation reaction us- 
ing these carbene reagents can be scaled 
up 1000-fold without incident (10). This
method can also be used to generate an un- 
substituted carbene precursor from form- 
aldehyde in kilogram quantities, thereby 
avoiding the use of the notoriously danger-
ous diazomethane (14, 15).

Besides addressing safety and stability concerns, selectivity and versatility in terms of reactivity are also important considerations in the development of carbene precursors. Zhang et al. show that the carbene precursors are compatible in more than 10 different types of reactions. Using inexpensive base metal catalysts, such as copper(I) chloride (CuCl) and cobalt(II) chloride (CoCl₂), alkene products were selectively produced through carbene dimerization. This worked well in the presence of different functional groups. Notably, nonstabilized alkyl-substituted carbenes, which are typically too reactive to achieve good yields in synthetic reactions, were also compatible under the reaction conditions with no problematic side reactions observed (12). Tuning carbene reactivity with other catalysts, such as rhodium(II) acetate [Rh₂(OAc)₄] and iron(II) chloride (FeCl₂), enabled even more chemical reactivity, resulting in cyclopropanation reactions of both electron-rich and electron-deficient alkenes with the use of a sulfide cocatalyst. Other three-membered rings such as epoxides and aziridines can also be generated from the corresponding aldehydes and imines under this diazo-free, catalytic strategy. In addition to dimerization and cyclopropanation reactions, another common reaction pathway for carbenes is insertion into polarized heteroatom-hydrogen bonds. Both stabilized aryl- and nonstabilized alkyl-carbenes generated from these precursors succeeded in this reactivity as well, undergoing efficient insertion into a wide variety of polarized bonds, for example, nitrogen-hydrogen, phosphorus-hydrogen, silicon-hydrogen, boron-hydrogen, and sulfur-hydrogen.

Carbene dimerization allows for the synthesis of symmetrical alkene products.

The straightforward synthe- 
sis, stability, scalability, and 
selectivity of the carbene pre- 
cursors reported by Zhang et 
al. will provide chemists with 
a better means for achieving interesting 
carbene applications in synthesis. The 
authors have provided a straightforward 
method to generate these valuable stabi- 
lized and nonstabilized reactive intermedi- 
ates from aldehydes by eliminating long- 
standing safety and reactivity challenges. 
Academic and industrial chemists who 
may have previously been hesitant to ex- 
plore carbene chemistry because of the as- 
sociated hazards and challenging synthesis 
can now try their hand with these inter- 
mediates, which are generated from simple 
building blocks.

Why we do what we do


From regenerating sea slugs to self-medicating sheep, a biologist probes the origins and evolution of behavior


By Rob Dunn 


Many of us have become estranged
from the rest of the living world.
We live inside and, in those rare
moments outside, are as likely to
be annoyed by other species as in-
spired. Just this week, one of my
neighbors asked how to kill the chipmunks 
in his yard, another how to get rid of rab- 
bits, and still another wanted “a solution” 
for the robins that were “flinging mulch 
into his driveway in their quest for worms.” 
Even when wild animals show themselves, 
we seem ill-prepared to pay attention. 

In her new book, Dancing Cockatoos and 
the Dead Man Test: How Behavior Evolves 
and Why It Matters, Marlene Zuk docu- 
ments the behavior of the animals around 
us on our behalf. She watches and writes 
with a sense of wonder, curiosity, and the 
abiding recognition that our own human 
lives only make sense in light of the behav- 
ior of other species. 

In this framing, Zuk implicitly advances 
what to some will be a radical idea: that 
humans are just another animal species. 
We may be unusual, and hence “special,” in 
some of our behaviors, but so too, she ar- 
gues, is the sea slug that abandons its body 
when attacked by parasites only to grow 
a new one from its disembodied head. To 
Zuk, an animal behaviorist, the world is 
full of species with individual stories, spe- 
cies that teach about life’s general rules 
and exceptions, and species from which we 
can and should learn. 

Are behaviors governed by nature or nur- 
ture? Zuk comes down in favor of a third 
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perspective in the book’s opening chapter. 
Genes, she argues, influence behavior, but 
how they do so depends on the environ- 
ment. Similarly, the environment influ- 
ences behavior, but how it does so depends 
on the genes. Here, as in every subsequent 
chapter, Zuk makes her point through the 
stories of a cavalcade of one species after 
another (including, eventually, humans). 
Then, she introduces exceptions and cave- 
ats, critiques, and counterarguments. One 
has the feeling of being inside a mind that 
is constantly weighing every perspective, 
ever aware that the work of making sense 
of the world is far from done. 

Each chapter considers a general con- 
cept or a question. The second chapter ex- 
plores how behavior evolves and the third 
how behavior is inherited. The 
fourth documents the surpris- 
ingly complex story of the evo- 
lution of the behavior of dogs. 
Later chapters describe the evo- 
lution of the behavior of domes- 
tic animals more generally, the 
evolution of human language, 
mental health among animals 
(including humans), big brains, 
sex, sex roles, and disease. 

In the book’s timely final 
chapter, Zuk describes how the 
behaviors of animals can be influenced by 
disease. Some fungi, for example, take over 
the bodies of ants and cause them to climb 
high onto trees where the fungi can more 
readily disperse. Jewel wasps, meanwhile, 
inject venom into cockroaches that makes 
them easy to control and coax home (as 
food for their babies). 


Dancing Cockatoos 
and the Dead Man Test 
Marlene Zuk 
Norton, 2022. 352 pp. 


The book also considers those cases in 
which animals’ behaviors help them to 
avoid disease. Chimpanzees self-medicate 
by eating plants that help to kill their in- 
testinal parasites, as do goats and sheep. 
Some populations of house sparrows bring 
cigarette butts into their nests to kill ticks. 
Ants gather antimicrobial resins and incor- 
porate them into their mounds. 

After exploring these examples, Zuk re- 
minds us of very similar phenomena in hu- 
mans that are neither fully genetic nor fully 
environmental in origin. Like ants, we can 
be manipulated by microbes. Humans who 
received the flu vaccine, for example, have 
been shown to be much more likely to be 
social (without realizing it) in the 48 hours 
after the shot than they had been before- 
hand, behavior that could poten- 
tially benefit a live virus. Like 
chimpanzees, humans use plants 
as medicines. Like many animals, 
humans employ social distancing 
in the presence of parasites. And 
like one African ant species, hu- 
mans use a mix of techniques to 
wash pathogens off their bodies 
to reduce the risk of infection. 

In the end, Zuk’s lovely book 
feels like a cabinet of curiosities 
whose details remind us to pay 
attention to the behaviors around us every 
day. Don’t try to kill the chipmunk, the rab- 
bit, or the robin. Watch them. Learn from 
them, and in doing so, learn also about 
yourself. Our story, however important it 
feels to us, is just one among millions.
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SCIENCE AND SOCIETY 


Inequality goes viral 


Structural factors exacerbate the impact of viruses 


By Ayah Nuriddin 


Michael Johnson was an under-
graduate student and athlete
at Lindenwood University near
St. Louis, Missouri, when he was
arrested for “recklessly” trans-
mitting HIV to five men in 2013.
Throughout his trial, Johnson was labeled 
as predatory, hypersexualized, and a vec- 
tor of disease. His identity as a gay Black 
man was put on trial as well. Johnson re- 
fused to plead guilty, arguing 
that he had notified his part- 
ners of his HIV status and 
had therefore not committed 
a crime. He was convicted 
and sentenced to 30 years in 
prison, but his conviction was 
eventually overturned and his 
sentence vacated because it 
was found that the prosecu- 
tion had withheld evidence. 
He was released in 2019. 

Johnson, according to Ste- 
ven Thrasher, professor of 
journalism at Northwestern 
University, is a member of 
the “viral underclass”—those 
individuals who bear the 
disproportionate impact of 
virus transmission, morbid- 
ity, and mortality because of 
structural inequality. Borrow- 
ing the term from HIV activ- 
ist Sean Strub, in his new 
book, The Viral Underclass, 
Thrasher argues that viruses 
amplify existing power struc- 
tures and exacerbate the im- 
pact of viral infection on people of color, 
the poor, the disabled, and those who iden- 
tify as LGBTQ+. 

The viral underclass is a global phe- 
nomenon, but, according to Thrasher, it is 
deeply connected to “the particular cruel- 
ties of the American empire.” Thrasher’s 
beautifully written account illustrates the 
complex and textured relationship be- 
tween disease and inequality, building on 
decades of scholarship and activism from 
Black, brown, queer, and disabled people 
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who understood that the disproportionate 
burden of disease is the result of the struc- 
tural nature of health inequality. 

The book focuses primarily on two vi- 
ruses—HIV and COVID-19—identifying 12 
interconnected social vectors that lead to 
unequal virus transmission: racism, indi- 
vidualized shame, capitalism, the law, aus- 
terity, borders, the liberal carceral state, 
unequal prophylaxis, ableism, speciesism, 
the myth of white immunity, and collec-
tive punishment. Each vector receives its
own chapter in which Thrasher relates the
personal stories of individuals who have
experienced the burden of systemic viral
inequality, providing a lens into the differ-
ent and devastating ways that it produces
the viral underclass and the beautiful and
heartbreaking ways that individuals and
communities survive, resist, and find joy
despite marginalization.

Volunteers conduct COVID-19 outreach in Rochester, New York, in 2020.

A virus can have different social conse- 
quences, depending on which vectors it 
intersects with. The book’s first chapter, 
“Mandingo: Racism,” opens with the mur- 
der of George Floyd and then focuses on 
Michael Johnson’s story. “From Athens to 
Appalachia: Austerity” centers on the life 
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and brutal murder of Greek drag queen 
and HIV activist Zak Kostopoulos amid 
austerity measures and subsequent police 
violence in Greece and shows readers how 
austerity measures exacerbate suffering 
for those disproportionately affected by vi- 
ral inequality. “Borderlands: Borders” de- 
scribes the vulnerability of being trans and 
undocumented through the life of trans 
HIV activist Lorena Borjas, considered 
“the mother of the trans Latinx commu- 
nity” in Queens, New York, who ultimately 
died alone of COVID-19. To- 
gether, these and other sto- 
ries are woven into a rich 
tapestry that narrates the 
impact of inequality on hu- 
man experience. 

In Thrasher’s analysis, vi- 
ruses often serve as a proxy for 
disease as a category. The book 
would have been enhanced, 
however, if he had drilled 
down into the particulari- 
ties of specific viruses—polio, 
influenza, human papilloma- 
virus, or monkeypox, for exam- 
ple—and how each reveals or 
obscures different dimensions 
of inequality. Likewise, includ- 
ing other agents of disease 
might have further enriched 
the book’s arguments. A mi- 
crobial underclass or an infec- 
tious underclass could expand 
a theory of the viral underclass 
to illustrate the unequal and 
disproportionate impacts of 
conditions such as tuberculo- 
sis, malaria, or asthma. 

Thrasher’s theory of the viral underclass 
creates opportunities to imagine new possi- 
bilities for a better world, or, as he writes, 
it “provides a map for a kind of worldwide 
liberation.” Identifying and addressing the 
vectors that produce the viral underclass 
would not only improve unequal virus trans- 
mission, it “would improve life on earth for 
nearly everyone.” 

Dismantling structural viral inequality 
requires recognition that the vectors that 
lead to it are deeply interconnected. Solu- 
tions will require us to imagine our desti- 
nies as similarly intertwined. 


10.1126/science.add5428 
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Mature common beech trees are among the species that can improve urban quality. 
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Ozone-reducing urban 
plants: Choose carefully 


Ozone (O₃) pollution is a threat to human
health, vegetation, biodiversity, and climate
(1). Millions of urban citizens face O₃ lev-
els above the World Health Organization
standards, leading to nearly 150,000 deaths
worldwide in 2019 (2). By 2050, 70% of the
world population will reside in cities (3).
Policies are urgently needed to reduce O₃
levels and prevent further deaths. Urban
plants are one promising strategy (4).
Plants can reduce air pollution by
adsorbing pollutants on their surfaces or
absorbing them through their leaves or
needles (5). Green urban infrastructure
can mitigate O₃ pollution, and mature
urban trees show higher O₃-removal
capacity than, for example, green roofs,
shrubs, or green walls (6). However,
not all trees are the same—to maximize
benefits, tree species should be selected
based on their O₃-removal capacity. Top
O₃-reducing tree species include com-
mon beech, small- and large-leaved lime,
London plane, sycamore maple, Norway
maple, tulip tree, horse chestnut, and
turkey oak (6, 7). Conversely, some spe-
cies form more O₃ than they remove, such
as blue gum, Italian and pubescent oak, 
spruce, hazel pine, weeping willow, and 
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common myrtle (6, 7). Generally, broadleaf 
tree species remove more O₃ than conifers
(8). Worldwide urban greening programs 
should make sure to select species that 
improve air quality. 

To maximize green O₃ removal, urban
strategies should also consider environmen-
tal conditions such as meteorology, soil,
and air quality and select the species best
adapted to the region. Every region should
find O₃-reducing species that are also long
lived and low maintenance, with the greatest
total leaf area possible. Although planting
trees will not be able to counterbalance all
anthropogenic O₃ pollution, choosing the
right species can maximize the benefits of 
this valuable air quality strategy. 

Pierre Sicard¹*, Evgenios Agathokleous²,
Alessandra De Marco³, Elena Paoletti⁴

¹ARGANS, Sophia Antipolis, France. ²School
of Applied Meteorology, Nanjing University of
Information Science and Technology, Nanjing,
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Ecosystems-—National Research Council, Sesto 
Fiorentino, Italy. 
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Climate change threatens 
Pakistan’s snow leopards 


The snow leopard (Panthera uncia) popu- 
lation currently spans the mountainous 
regions of 12 countries, including more 
than 80,000 square kilometers in northern 
Pakistan (1). As a result of human encroach-
ment and hunting, snow leopards are clas-
sified as Vulnerable (2). Climate change is
exacerbating the threats the snow leopard
already faces as well as transforming its
environment in ways that make survival
more difficult. Pakistan must take urgent 
action to protect this important species. 

Since the partition of India in 1947, 
the human population in the snow leop- 
ard’s range in Pakistan has quadrupled, 
and livestock numbers have increased 
by 40 to 60% (3). Habitat loss driven by 
human activities and economic growth has 
fragmented the snow leopard’s habitat. 
Overgrazing has caused severe degradation 
of grasslands, and an overabundance of 
livestock has encroached on the habitat of 
wild ungulates, decreasing the prey avail- 
able to snow leopards (4). 

Growing human populations and live- 
stock in the snow leopard’s habitat have 
led to increased illegal trafficking, poach- 
ing, and human-wildlife conflict. On aver- 
age, one snow leopard per day has been 
lost to poaching during the past decade 
(5). Snow leopards are also frequently 
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killed in reprisal for attacks on local herd- 
ers’ livestock (6). 

In addition to these harmful trends, global 
warming is causing the forest line in moun- 
tainous regions to move to higher altitudes, 
compressing the suitable habitat available 
to snow leopards. Changing climate has also 
allowed leopards and other low-altitude 
carnivores to migrate to higher altitudes, 
increasing the snow leopard’s competition 
(7). Furthermore, reduced glacier cover will 
lead to increased drought risk, changes to 
local flora and fauna, and fewer food alter- 
natives for species like the snow leopard (8). 

The snow leopard is a top predator and 
an indicator of the overall health of its 
high-altitude habitat (9). To protect the spe- 
cies, Pakistan must mitigate the effects of 
climate change and human settlements (10).
Priorities for climate-informed conservation 
should include passing legislation to prevent 
poaching and retaliatory killings. Predator- 
proof livestock cages could decrease the 
chances of human-wildlife conflict. Helping 
people in the snow leopard’s range adapt 
to the effects of climate change will also 
benefit wildlife, as will continuing efforts to 
minimize habitat degradation. Training and 
education could help tourists and the pub- 
lic avoid the species’ habitat. Researchers 
should learn more about the species’ biology, 
especially its genetics and the diseases to 
which it is susceptible. To monitor progress, 
Pakistan should track the number of snow 
leopards and look for trends in population 
size and distribution. 
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Primary forest loss in 
biodiverse Indian states 


Primary forests—old-growth forests that 
have remained undisturbed by human- 
kind—comprise one-third of the world’s 
forests. These dense, wild forests are 
important habitats for unique species
(1) and provide a variety of ecosystem
services, including global biosphere-
atmosphere CO₂ exchange (2). Such forests
are irreplaceable in terms of biodiversity 
value and ecosystem services, and replan- 
tation to compensate for such forest loss 
is inadequate (3). Yet loss of primary for- 
ests continues all over the world, and gov- 
ernments have failed to protect them (4). 
For example, in India, primary forest is 
being lost in biodiversity hotspots, which 
should be environmental priorities. India’s 
government must acknowledge the value 
of primary forests and prioritize their 
conservation by committing to holistic 
conservation strategies. 

India is home to four global biodiversity 
hotspots, including the Himalayas and the 
Indo-Burma region. These areas, with high 
altitudes and unique climates, support a 
variety of ecosystems and harbor distinct 
species diversity and endemism (5, 6). 
However, nine states in the Himalayas and 
Indo-Burma regions accounted for 89% of 
total primary forest loss in India between 
2015 and 2021 (7). Between 2002 and 2021, 
India saw a 3.6% reduction in the total 
area covered by humid primary forest (7). 


The ongoing primary forest loss 
will have detrimental effects on the global
environment and biodiversity. In these 
high-altitude areas, the rate of regional 
warming has exceeded the rate of global 
warming (8). The massive forest loss, cli- 
mate extremes, increased temperatures, 
and altered rainfall patterns in these
biodiversity hotspots intensify competi- 
tion for species survival, leading to both 
species extinction and changes in species 
distribution (9, 10). Deforestation in the 
Himalayas alone is expected to cause the 
extinction of a quarter of endemic species 
by 2100, including 366 vascular plant taxa 
and 35 vertebrate taxa (11). 

Scientists, activists, and the public must 
persuade local and national governments 
to pass legislation and implement holistic 
strategies that halt primary forest loss. 
Sustainable forest conservation strategies 
include community conserved forest man- 
agement, in which the residing communi- 
ties (mostly native and tribal peoples) are 
given rights to the forest, and strict prohi- 
bition of forest land diversion for nonfor- 
estry purposes, including infrastructure. It 
is time to secure the long-term integrity of 
these irreplaceable primary forests. 
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Comment on “Pushing the frontiers of density func- 
tionals by solving the fractional electron problem” 


Igor S. Gerasimov et al. 

Kirkpatrick et al. (Reports, 9 December 2021, 
p. 1385) trained a neural network-based DFT 
functional, DM21, on fractional-charge (FC) 
and fractional-spin (FS) systems, and they 
claim that it has outstanding accuracy for 
chemical systems exhibiting strong correla- 
tion. Here, we show that the ability of DM21 
to generalize the behavior of such systems 
does not follow from the published results 
and requires revisiting. 

Full text: dx.doi.org/10.1126/science.abq3385 


Response to Comment on “Pushing the frontiers 
of density functionals by solving the fractional 
electron problem” 


James Kirkpatrick et al. 

Gerasimov et al. claim that the ability of 
DM21 to respect fractional charge (FC) and 
fractional spin (FS) conditions outside of 
the training set has not been demonstrated 
in our paper. This is based on (i) asserting 
that the training set has a ~50% overlap 
with our bond-breaking benchmark (BBB) 
and (ii) questioning the validity and accuracy 
of our other generalization examples. We 
disagree with their analysis and believe 
that the points raised are either incorrect 
or not relevant to the main conclusions of 
the paper and to the assessment of general 
quality of DM21. 

Full text: dx.doi.org/10.1126/science.abq4282 
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NEUROMODULATION 


Synchronizing neural rhythms 


Personalized, noninvasive network-based neuromodulation 


aids impaired cognition 


By Robert M. G. Reinhart


Cognitive brain disorders are among
the most disabling health states in
the world. Existing pharmacological,
surgical, and behavioral therapeutic
approaches for impaired cognition
are limited by heterogeneous treat-
ment outcomes, slow symptom resolution,
and accompanying risks and side effects. 
Consequently, there is an urgent need to 
develop innovative and personalized thera- 
peutic interventions that are capable of pro- 
viding rapid and sustainable improvements 
with minimal side effects. Noninvasive neu- 
romodulation technology is an emerging 
class of tools that offer such translational 
potential for neurocognitive disorders. 
Tools such as high-definition transcranial 
alternating current stimulation (HD-tACS) 
offer unprecedented control in modulating 
rhythmic activity in cortical regions that are 
implicated in neurocognitive dysfunction 
(1). Our latest research in precision or per- 
sonalized HD-tACS shows promise at non- 
invasively manipulating neural population 
dynamics and improving human cognition 
and adaptive behavior in a selective and 
long-lasting fashion (2-5). 

Our neuromodulation design is grounded 
in fundamental neuroscience research that 
views cognition as arising from synchronous 
electrophysiological rhythms across mul- 
tiple spatiotemporal scales (2, 4-6). Neural 
rhythms derive from inhibitory and excit- 
atory postsynaptic currents and are evident 
as cyclic changes in the voltages of local 
field potentials, electrocorticograms, elec- 
troencephalograms, and magnetoencephalo- 
grams. Functionally, synchronized rhythms 
sculpt neurophysiological dynamics to fa- 
cilitate precise yet flexible communication 
necessary for goal-directed action and cogni-
tion (1, 6). This synchronization is achieved
by gating information transmission during 
windows of high excitability through tran- 
sient, phase-coordinated local neuronal spik- 
ing within and across neuronal networks. 
By coordinating spike timing, synchroniza- 
tion increases the likelihood of inducing 
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spike timing-dependent plasticity, thereby 
promoting flexible cognitive function. As a 
result, neural synchronization is recognized 
as a fundamental neurophysiological process 
that is necessary for human cognition. 

The synchrony of neural rhythms within 
or between cortical regions can be changed 
noninvasively using tACS. This technique 
delivers low-intensity alternating currents 
on the scalp using a predetermined topo- 
graphical arrangement of electrodes to in- 
duce alternating electric field gradients in a 
target brain region. These alternating elec- 
tric fields manipulate spike timing through 
phase alignment of neuronal activity, with- 
out necessarily affecting the spiking rate of 
the neuronal population (7). The entrain- 
ment of neuronal activity can lead to the 
induction of neuroplasticity, which allows 
any neural and cognitive changes to persist 
beyond the duration of neuromodulation. 
Moreover, simultaneous application of two 
(or more) tACS currents targeted to differ- 
ent brain regions can be used to modulate 
phase synchronization among them (3). 
Thus, tACS can be used to noninvasively 
manipulate synchronized rhythmic activity 
within and between cortical networks. 

Advances in tACS technology and proto- 
col design offer expanded functional and 
anatomical targeting over conventional 
tACS. Multichannel HD-tACS designs, which 
involve substantially smaller electrodes ar- 
ranged in a center-surround configuration, 
have been shown to produce more focal 
stimulation in cortical targets (8). The spatial 
arrangement of these electrodes on the scalp 
is guided by computational models of electric 
fields, which further improves the focality of 
stimulation (8). Enhanced focality minimizes 
the likelihood of producing unwanted effects 
in other anatomical structures such as the 
retina or the extracranial tissue. Recently, 
considerable attention has been devoted to 
the personalization of HD-tACS (9). Here, 
key properties of individual neurophysiology 
such as anatomical variability or intrinsic 
rhythmic frequency of a neural circuit are 
considered. Using a combination of these de- 
velopments, we have demonstrated effective, 
functionally specific, bidirectional, and long- 
lasting control over network synchronization 
patterns that support cognitive function (2-5, 
10, 11). The integration of these advances into 
the development of personalized, highly fo- 
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cal designs offers a promising opportunity to 
better steer the plasticity mechanisms of hu- 
man cognition. 

We recently discovered that synchroni- 
zation-dependent neural coding schemes 
underlie poorer memory function in people 
aged 60 to 76 years and developed advanced 
neuromodulation protocols that target these 
motifs for memory enhancement (see the 
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online figure, top). Before neuromodula- 
tion, these individuals showed poorer work- 
ing memory performance compared with 
younger adults (2). These impairments were 
found to be associated with reduced theta- 
gamma phase-amplitude coupling (PAC) in 
the temporal cortex (2). PAC is a well-stud- 
ied neural coding motif that occurs when 
the amplitude of a high-frequency rhythm
synchronizes with the phase of a low-fre-
quency rhythm. This form of synchroniza-
tion facilitates the integration of informa-
tion across spatiotemporal scales within a
nested cortical network (6, 12). We found 
that local PAC deficits in the temporal cortex 
arose because of deficient prefrontal control 
marked by reduced theta-phase synchroni- 
zation between the frontotemporal areas. 
Phase synchronization—when two or more 
rhythmic neuronal signals tend to cycle with 
consistent relative phase—is another lead- 
ing neural coding motif for coordinating 
spatiotemporal neuronal activity (1, 6, 12).
These synchronization schemes thus serve 
as potential targets for neuromodulation to 
improve memory function. 
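
The two coding motifs defined above can be made concrete with a small numerical sketch. The Python below is only an illustration on synthetic signals, not the analysis pipeline of the cited studies. The 6-Hz "theta" and 40-Hz "gamma" frequencies, the helper names, and the choice of a phase-locking value and a normalized mean-vector-length measure of PAC are all assumptions made for the example.

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal; its angle is phase, its magnitude is amplitude."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def phase_locking_value(x, y):
    """Phase synchronization: consistency of the relative phase of two rhythms (0 to 1)."""
    dphi = np.angle(analytic_signal(x)) - np.angle(analytic_signal(y))
    return np.abs(np.mean(np.exp(1j * dphi)))

def pac_strength(slow, fast):
    """PAC: how strongly the fast rhythm's amplitude follows the slow rhythm's phase."""
    phase = np.angle(analytic_signal(slow))
    amplitude = np.abs(analytic_signal(fast))
    return np.abs(np.mean(amplitude * np.exp(1j * phase))) / np.mean(amplitude)

# Synthetic data (illustrative values only): 4 s sampled at 1 kHz.
fs = 1000
t = np.arange(4000) / fs
theta_frontal = np.cos(2 * np.pi * 6 * t)
theta_temporal = np.cos(2 * np.pi * 6 * t - np.pi / 4)   # fixed phase lag: synchronized
theta_drifting = np.cos(2 * np.pi * 7 * t)               # different frequency: unsynchronized
gamma_coupled = (1 + 0.8 * np.cos(2 * np.pi * 6 * t)) * np.cos(2 * np.pi * 40 * t)
gamma_uncoupled = np.cos(2 * np.pi * 40 * t)

plv_locked = phase_locking_value(theta_frontal, theta_temporal)    # close to 1
plv_unlocked = phase_locking_value(theta_frontal, theta_drifting)  # close to 0
pac_coupled = pac_strength(theta_frontal, gamma_coupled)           # about 0.4
pac_uncoupled = pac_strength(theta_frontal, gamma_uncoupled)       # close to 0
print(plv_locked, plv_unlocked, pac_coupled, pac_uncoupled)
```

In this toy setting, a constant phase lag between two rhythms still counts as full phase synchronization, whereas a frequency mismatch destroys it. Likewise, PAC is high only when the gamma envelope rises and falls with the theta cycle. Real recordings would first need band-pass filtering and statistical controls, which this sketch omits.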

Guided by electric field modeling, we de- 
veloped a personalized HD-tACS protocol 
to rescue theta-phase synchronization in 
the frontotemporal cortex. The frequency 
of synchronization was individually deter- 
mined for each participant to maximize the 
likelihood of entrainment. Simultaneous 
in-phase entrainment of both frontal and 
temporal regions at personalized theta 
frequencies induced in this manner re- 
stored intrinsic frontotemporal theta-phase 
synchronization, recovered the deficient 
theta-gamma PAC in the temporal cortex 
(see the online figure, top), and improved 
working memory performance in older 
adults (2). Even though neuromodulation 
was performed for ~25 min, improvements 
in memory function were sustained for at 
least 50 min, suggesting that the protocol 
produced neuroplastic changes outlasting 
the modulation period (2). Moreover, an ad- 
ditional experiment in younger adults with 
antiphase synchronization of frontotempo- 
ral regions demonstrated that memory per- 
formance can even be down-regulated. This 
finding suggests that cognitive function can 
be bidirectionally manipulated using phase- 
dependent interregional synchronization. 
This property of precision neuromodulation 
may be useful in pathologies where overac- 
tive memory processes need to be regulated, 
such as in posttraumatic stress disorder. 

Our precision neuromodulation approach 
identified that it was essential to perform 
HD-tACS using personalized theta frequen- 
cies. By contrast, control experiments with 
a fixed theta frequency for all participants 
did not produce any improvements in mem- 
ory function in older adults. Thus, advances 
in noninvasive neuromodulation that lever- 
age the spatial and spectral parameters of 
individual neurophysiology offer a promis- 
ing opportunity to effectively synchronize 
large-scale brain rhythms and rapidly im- 
prove memory function in older people. 
Such developments are especially valuable 
considering the rapidly aging global popu- 
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lation and its associated personal, social, 
health care, and economic costs. 

Current theories in biological psychiatry 
on the nature of compulsivity, including 
obsessive-compulsive disorder (OCD), view 
symptoms as outcomes of dysregulated 
habits and atypical reward processing due 
to abnormalities in cortico-basal ganglia 
networks (13, 14). In parallel, fundamental 
neuroscience research has identified a neu- 
ral signature in the form of medial-frontal 
beta-gamma rhythms, presumed to arise 
from the orbitofrontal cortex (OFC) during 
reward processing (see the online figure, 
bottom) (15). Combining these insights, we
proposed that beta-gamma rhythms may 
constitute the neural code underlying orbi- 
tofrontal-striatal interactions that give rise 
to abnormal reward processing and OCD 
symptoms. To test this theory, we devised 
a personalized model-guided HD-tACS pro- 
tocol for targeting individual beta-gamma 
rhythms of the OFC (see the online figure, 
middle) and demonstrated rapid, revers- 
ible, frequency-specific modulation of re- 
ward-guided choice behavior and learning 
in healthy young adults (4). Next, by repeat- 
edly modulating personalized OFC beta- 
gamma rhythms over 5 days, we effectively 
reduced obsessive-compulsive behaviors in 
a nonclinical population. The rapid reduc- 
tion in obsessive-compulsive behaviors—in- 
cluding hoarding, ordering, and checking— 
lasted for at least 3 months (4), and the 
largest improvements were experienced in 
people with more severe symptoms. These 
findings bode well for extending this per- 
sonalized neuroscience intervention to peo- 
ple with clinical OCD and other compulsiv- 
ity disorders, such as behavioral addiction 
(e.g., gambling, internet), eating disorders, 
substance use or abuse, and Tourette syn- 
drome. More broadly, because the OFC is 
increasingly recognized to play a central 
role in the pathophysiology of mood, anxi- 
ety, psychosis, and other major categories 
of psychiatric disorders (14), the noninva- 
sive procedure we developed for selectively 
modulating OFC beta-gamma rhythms 
could lay the basis for future nonpharmaco- 
logical therapeutics that are applicable to a 
wide range of psychiatric illnesses. 

The fields of fundamental and clinical 
neuroscience have made extraordinary ad- 
vances in understanding the dynamic struc- 
ture of the neuronal network activity that 
underlies cognitive function and dysfunc- 
tion. Leveraging these insights has allowed 
us to develop neuromodulation protocols, 
personalized to individual neurophysiology, 
that can selectively augment components 
of rhythmic cortical networks and improve 
cognitive function and adaptive behavior in 
a rapid and sustainable fashion. Although it 
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is challenging to predict the future, we are 
optimistic that personalization rooted in 
the neuroscience of network dynamics will 
rise to the forefront of next-generation non- 
invasive neuromodulation and pave the way 
toward future use of precision electroceuti-
cals in neurology and psychiatry.
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NEUROMODULATION 


Ultrasound neuromodulation 


of the deep brain 


Noninvasive, reversible stimulation of neural 


circuits can regulate behavior 


By Davide Folloni 


Understanding the relationship be-
tween the brain and behavior is one
of the main goals of neuroscience
and is crucial for the development
of rehabilitative technologies. When
is still no tool able to modulate activity in 
deep areas of the brain with millimetric res- 
olution and in a noninvasive and reversible 
fashion. Beyond neurorehabilitation, the 
understanding of the functional role of a 
brain area can be established only through 
causal inference—that is, by manipulating 
its activity and observing associated behav- 
ioral changes. 

Current reversible neuromodulation 
methods have limited spatial resolution and 
do not reach regions deep in the brain (1).
For this reason, my colleagues and I decided
to implement an alternative approach,
called focused transcranial ultrasound
stimulation (TUS), which relies on the strong
interactions between low-intensity sound 
waves and brain tissue to transiently modu- 
late neural activity noninvasively, revers- 
ibly, and with high spatial precision. Our 
work shows that TUS, compared with mag- 
netic or electric fields, can directly focus 
sound waves to small regions deep in the 
brain without affecting the overlying cortex 
and with results previously only achievable 
using surgically implanted electrodes and 
other invasive procedures (2). 

Activity in each brain area is modulated 
by the network of regions with which that 
area is interconnected. Such connections 
determine which regions influence an area 
and, in turn, which regions are influenced 
by that area. In an initial set of experi- 
ments (2), we developed a multimodal ap- 
proach combining TUS with whole-brain 
resting-state functional magnetic reso- 
nance imaging (fMRI) in macaque mon- 
keys to selectively manipulate neural ac- 
tivity in a subcortical area (the amygdala) 
and in a deep cortical region [the anterior 
cingulate cortex (ACC)] (see the figure, upper
left). Whole-brain fMRI enabled the recording
of poststimulation activity in both
the stimulated area and interconnected
neural circuits.

Icahn School of Medicine at Mount Sinai, New York, NY, USA.
Email: davide.folloni@mssm.edu

By using a 40-s offline TUS protocol, 
with a wave frequency set to the 250-kHz 
resonance frequency and 30-ms bursts of 
ultrasound generated every 100 ms, we 
demonstrated for the first time in primates 
(which are characterized by a thick skull 
bone, similar to that in humans) that TUS 
can be used transcranially to modulate ac- 
tivity in both the amygdala and ACC for 
up to 2 hours with high spatial specificity 
and without affecting the overlying cortex 
(2). TUS targeted to the amygdala resulted 
in focal changes in the activity coupling of 
this region with a set of areas compared 
with a no-stimulation (i.e., sham) condition. 
However, this modulation was not present 
if we targeted TUS to an active control area, 
such as the ACC. The same focal effects— 
now centered on the neuromodulation of 
the ACC—were observed when we sonicated 
this latter region (2). We replicated these ef- 
fects successfully in other cortical regions 
(3). The duration of these effects and their 
reversibility may present therapeutic op- 
portunities in patients with dementia, mo- 
tor-related pathologies, and psychiatric and 
neurological conditions. 
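
The timing of the protocol described above can be made concrete with a little arithmetic. The sketch below is only an illustration: it takes the four values stated in the text (250 kHz, 30-ms bursts, one burst every 100 ms, 40 s in total) and derives the duty cycle, burst count, and total sound-on time. The function and variable names are hypothetical, and nothing here models acoustic intensity, pressure, or skull transmission.

```python
# Pulse-timing arithmetic for the offline TUS protocol described in the text.
# Only the four input values below come from the article; all names are
# illustrative, and no acoustic intensity or pressure is modelled.

CARRIER_HZ = 250_000   # 250-kHz ultrasound frequency
BURST_MS = 30          # each burst lasts 30 ms
PERIOD_MS = 100        # one burst is generated every 100 ms
TRAIN_MS = 40_000      # the whole protocol lasts 40 s


def tus_timing(carrier_hz, burst_ms, period_ms, train_ms):
    """Return derived timing quantities for a pulsed ultrasound train."""
    n_bursts = train_ms // period_ms
    return {
        "duty_cycle": burst_ms / period_ms,            # fraction of time sound is on
        "burst_rate_hz": 1000 / period_ms,             # bursts per second
        "n_bursts": n_bursts,                          # bursts in the whole train
        "sound_on_s": n_bursts * burst_ms / 1000,      # cumulative sonication time
        "cycles_per_burst": carrier_hz * burst_ms // 1000,
    }


timing = tus_timing(CARRIER_HZ, BURST_MS, PERIOD_MS, TRAIN_MS)
for name, value in timing.items():
    print(f"{name}: {value}")
```

Under these assumptions the train has a 30% duty cycle, so the 40-s protocol delivers 400 bursts and about 12 s of actual sonication.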

Transcranial ultrasound modulation of 
deep brain regions could be groundbreak- 
ing in health care, but only if it is able to 
translate into a behavioral change. We 
therefore continued our research by ap- 
plying TUS to awake-behaving monkeys 
trained to perform a decision-making 
task in which they had to learn the value 
of choice options (4). The animals’ choice 
strategies were typically supported by ac- 
tivity in the same ACC region that had 
been targeted previously (2) and the hip- 
pocampus. The ACC specifically tracked 
the value of all potential counterfactual 
choices during a trial, and its activity pre- 
dicted the translation of this value into a 
change in future better choices. TUS tar- 
geted offline to the ACC led to a substan- 
tial alteration in this behavior (4). In sub- 
sequent experiments (5, 6), we reproduced 
the selective neuromodulatory effects of
TUS on behavior, using it to modulate activity
in the basal forebrain and medial
prefrontal cortex. 

A brain area’s role is partly a function of 
the interactions that it has with other brain 
regions. Our previous results show
that each area’s interactions can be altered
in a substantial way (2-6), which suggests that
by temporarily modulating activity within 
a deep brain area using TUS, this area’s 
ability to contribute to a cognitive process 
can be disrupted. Although we had learned 
about the neural (2) and behavioral (4) ef- 
fects of TUS separately, we still did not 
know whether deep brain TUS caused si- 
multaneous changes in both activity—at the 
target as well as within its interconnected 
network—and behavior. 

To answer this, we applied TUS to ma- 
caque monkeys performing a decision- 
making task while simultaneously record- 
ing their neural activity using an MRI 
scanner (7). The animals were required to 
flexibly learn contingent associations be- 
tween choice options and outcomes and use 
these associations to guide subsequent be- 
havior—a process called credit assignment 
[suggested to be encoded in the ventral prefrontal
cortex (vPFC), more specifically in
area 47/12o] (8, 9, 10). To causally test the
role of 47/12o in credit assignment, TUS was
applied to the animals’ 47/12o areas immediately
before they entered the MRI scanner
to perform the task. The resultant changes 
in behavior and associated neural activity 
were compared with a sham condition and 
with a third active-control condition, where 
TUS was instead applied to the adjacent 
anterior prefrontal cortex (aPFC) (see the 
figure, upper right). By exploiting the tran- 
sient and noninvasive properties of our TUS 
protocol, we were able to interleave sham 
and TUS sessions within each animal and 
counterbalance their order across all ani- 
mals, thereby controlling for any potential 
order effect. 

Primates, including humans and ma- 
caques, learn how to make choices by identi- 
fying over time the pattern of outcomes that 
follow a specific action. In our daily life, it 
is essential to know whether the beneficial 
consequences that follow our choices are 
simply a feature of the current environment 
we find ourselves in or whether these out- 
comes are genuinely caused by the choices 
we made. Subject animals were generally 
good at learning from previous outcomes, 
and this behavior was associated with activity
in the vPFC, including area 47/12o. Focal
TUS modulation of 47/12o activity, however,
disrupted such learning and the animals’ 
ability to integrate the observed positive 
outcomes with the concomitant choices recently
taken (7).

(A) Focused transcranial ultrasound stimulation (TUS) applied to the primate anterior cingulate cortex (ACC) in
a macaque. (B) Experimental TUS conditions: sham, where no stimulation was applied (blue); TUS targeting brain
area 47/12o (red); and an active control condition (green) in which TUS was applied to the anterior prefrontal cortex
(aPFC). (C) These plots indicate the influence of past reward history (x axis) and choice history (y axis) on which
choice is taken next during the three conditions. The brighter (i.e., lighter gray) the main diagonal, the stronger
the influence of the conjoint history of choice and reward on future behavior. Labels t-1, t-2, and t-3 refer, respectively,
to the last occasion, the previous occasion, and the occasion before that on which a given choice option was
encountered. In sham, choices are influenced by the conjoint history of choice and reward. (D) In the sham condition,
activity in several frontal cortical areas, including the ACC, reflected values of choices that could be taken. This
value signal in the ACC was significantly and selectively reduced after 47/12o TUS but not after aPFC TUS.


We next used a complementary approach 
to examine how the extended history of 
choice-reward contingencies experienced 
would influence which choice a macaque 
would make in the future (8, 10). We com- 
bined TUS and fMRI with computational 
models to show that two separate subfields 
of cortex encode these two features of rewards:
Area 47/12o activity reflects the process
of credit assignment, whereas the adjacent
anterior insula tracks a general signal
reflecting the value of the environment. TUS 
targeted to 47/12o (see the figure, middle,
red) selectively changed the animals’ ability 
to learn choice-reward contingencies com- 
pared with sham (see the figure, middle, 
blue). This impairment was absent if TUS 
was applied to the nearby control aPFC (see 
the figure, middle, green) (7). 

Credit assignment encoded in area 47/12o
is pivotal in our everyday behavior because
it determines the value assigned to every 
choice we make. Ultrasound-based disrup- 
tion of this area not only affected credit 
assignment but also resulted in activity 
changes in an extended cortical circuit, in- 
cluding a subregion of the ACC where we 
had previously shown that choice values are 
held and compared during decision-making 
(see the figure, middle, red) (4). Notably, although
TUS applied to 47/12o
modulated functionally dependent signals
in these distant components of monosynaptically
interconnected neural circuits
(see the figure, bottom), such alteration of
47/12o left functionally independent activity
(e.g., signals in the immediately adjacent
anterior insula that kept tracking the gen- 
eral value of the environment regardless of 
the choice taken) intact (7). Although these
studies were done in nonhuman primates,
they suggest the possibility of safely trans- 
lating TUS to humans, especially in those 
brain circuits that are evolutionarily pre- 
served across species (11). 

Over the past few years, we have dem- 
onstrated that the noninvasive properties 
of TUS and its high spatial resolution have 
the potential to substantially enhance our 
causal understanding of how the human 
brain works as well as our development of 
therapeutic protocols targeting abnormal 
activity in deep brain regions. TUS’s ability 
to provide manipulation techniques previ- 
ously only available in rodent models is 
likely to critically affect future human brain 
investigations. 

The possibility of using a short and of- 
fline protocol makes TUS a promising 
cutting-edge treatment for psychiatric and 
neurological patients, especially when their 
symptomatology prevents the application 
of longer or invasive treatments. Finally, the 
possibility of reaching reversible modula- 
tion of neural activity with millimetric reso- 
lution even in subcortical brain areas may 
provide surgeons with a presurgical
tool to identify, in each of their patients, 
which portions of gray or white matter are 
more likely to lead to an improvement of 
symptoms after implantation of deep brain 
stimulation electrodes.

SPECIAL SECTION

GRASS

PERSPECTIVE

The history and challenge of grassy biomes

Grassy biomes are >20 million years old but are undervalued and under threat today

By Caroline A. E. Strömberg and A. Carla Staver


Grassy biomes, from the steppes of
Mongolia to the savannas of Tanzania,
are predicted to be the ecosystems
hardest hit by the ongoing
climate and land use crises. The
history of humans has been profoundly
intertwined with grassy biomes.
Homo evolved in the savannas 2 million 
years ago (Ma), and agricultural societies 
arose through the domestication of grasses, 
such as wheat and barley, 10,000 years ago. 
These grass crops, as well as corn and rice, 
remain dominant staple foods globally (1).
Livestock production also centers in areas 
that were once (and sometimes still are) na- 
tive grasslands. Grassy biomes harbor dis- 
tinct and diverse sets of plants and animals 
that have adapted to these environments 
through millions of years of evolution (2). 
As the biodiversity and economic promi- 
nence of grassy biomes are increasingly be- 
ing recognized, there is a demand for bet- 
ter understanding of their past and present 
function to inform policy and management. 

Grassy biomes are biogeographically wide- 
spread, accounting for >25% of all land on 
Earth, including 35% of the tropics and sub- 
tropics. The emergence of grassy systems 
during the Cenozoic (the past 66 million 
years) was complex, shaped by climate, soils, 
fire, and herbivory in ways that are not fully 
understood (see the figure). Clarifying these 
mechanisms will be key for managing the 
fate of grassy biomes under ongoing and fu- 
ture environmental changes that are driven 
by human activities. 

Grasses, defined as plant species in the 
family Poaceae, originated by the Late Creta- 
ceous (100 Ma) (3) but did not become eco- 
logically dominant until >70 million years 
later, in the later Cenozoic. This exceptionally 
long lag has prompted evolutionary biologists 
and paleontologists to search for the drivers 
that allowed grass to reach its current global 
prominence. Today, most grasses are associated
with open-canopy habitats, owing to several
traits acquired relatively early in Poaceae
evolution (100 to 60 Ma) (1, 3). For example,
grasses may have quickly evolved a rapid life 
cycle and persistent buds, permitting quick 
regrowth after drought, frost, or disturbances 
such as fire and grazing. Starting by 55 Ma, 
several groups of grasses evolved so-called 
C4 photosynthesis (as opposed to C3 photosynthesis),
which allows them to prosper in
hot and dry areas (1). In colder climates, C3
open-habitat grasses developed the tolerance 
needed to survive frosts by 30 Ma (4). How- 
ever, although the evolutionary traits suited 
to open habitats appeared earlier, open-hab- 
itat grasses remained ecologically rare until 
later in the Cenozoic. 

Once grasses started spreading across the 
globe, their takeover was asynchronous and 
followed continent-specific trajectories. For 
instance, grassy habitats appeared in North 
America by 25 Ma but not until 7 Ma in Aus- 
tralia (5, 6). However, the first subtropical 
grassy biomes were unlike anything observable
there today, featuring C3 open-habitat
grasses that today are found in colder regions 
(6). It was not until several million years 
later that tropical open-habitat C4 grasses expanded
to form grasslands and savannas at
low to mid-latitudes (5, 7), roughly coincident 
with the spread of frost-tolerant grasses at 
higher latitudes. 

Grassy biomes thus emerged during the 
Cenozoic at different times in different places 
and, at least in part, for different reasons. 
Studies in modern grassy biomes suggest that 
aridity and rainfall seasonality, as well as fire 
and herbivory, could all favor grasses over 
trees (2), with even larger benefits at lower 
atmospheric CO2 concentrations. The fossil
record shows that many of these conditions 
did occur in the late Cenozoic. By 34 Ma, atmospheric
CO2 levels had dropped, and the
globe underwent a period of cooling. In many 
areas, altered atmospheric circulation and 
mountain uplift (e.g., of the Tibetan Plateau) 
resulted in aridification or seasonal drought, 
and fossil evidence indicates increased fire 
activity near the end of the Cenozoic (5). 
Further, large grassland-type mammal her- 
bivores (e.g., bovids) diversified during the 
mid- to late Cenozoic (8). 

Asynchrony in the emergence of grasses on 
different continents suggests that, although 
global factors such as low-CO2 conditions
may have spurred the diversification and
expansion of open-habitat and especially C4
grasses (7), changes in CO2 were typically not
enough to allow grasses to dominate. A rapidly
expanding geochemical and paleontological
tool kit has allowed for more detailed
insights. Studies have shown that regional 
changes in climate and fire interacted with 
existing vegetation to influence trajectories 
of emerging grass dominance, with diver- 
gence across continents. For example, the 
earliest North American C3 grassy habitats
replaced forests as seasonal drought developed
(6), and in Australia, C4 grasses favored
by pronounced aridification overtook fire- 
adapted eucalypt woodlands that had existed 
there for tens of millions of years before (5). 
By contrast, in South Asia and southwest 
Africa, more frequent and intense wildfires 
promoted replacement of fire-sensitive vege- 
tation with grasses (9), suggesting a substan- 
tial regional, if not global, role for fire. 

In addition to environmental conditions, 
herbivores may also have directly contrib- 
uted to the spread of grassy vegetation, al- 
though the mechanisms are not yet under- 
stood. Defense strategies against herbivores 
by savanna trees, such as growing spines or 
thorns, evolved concurrently with the spread 
of grasses and the diversification of bovids in 
Africa (~17 Ma) but long before fire activity 
increased (8). This suggests that, at least in 
Africa, herbivores structured grassy biomes 
before fire did. However, just how important 
animals were in shaping the evolution of 
grassy vegetation remains untested and will 
require adapting methods of estimating past 
herbivore intensity (such as studying fungal 
spores in fossilized dung) for Miocene and 
older samples. 

Since they first appeared, grassy biomes 
have continued to shift in extent, structure, 
and composition, prompted by advancing 
and retreating ice sheets during the global Ice 
Age (2.6 Ma onward). Today, they are widely 
distributed on every continent except Ant- 
arctica, with a range in part associated with 
aridity and rainfall seasonality. Some 60% of 
grassy ecosystems receive <750 mm of annual 
rainfall, most with a dry season that shapes 
plant physiology. This provides a rationale 
for the argument that aridity drove late Ce- 
nozoic grassland expansion. However, 40% of 
grassy ecosystems extend into higher-rainfall 
regions with >750 mm of annual rainfall that 
can support forests. These moderately wet, or 
“mesic” grassy ecosystems are biogeographi- 
cally distinct from semiarid ones, but both 
are evolutionarily ancient (1). Yet, whereas
semiarid savannas are widely accepted as the
native vegetation of large areas of the globe,
mesic savannas were long assumed to represent
degraded forests. Only recently have
mesic savannas been acknowledged for their
contributions to endemic biodiversity and
distinctive ecosystem function.

The history and legacy of grassy biomes

Grassy biomes exist in a wide range of climates, from cold to hot and arid to wet (top). Although changing
environmental conditions through time have shaped their past and present distribution, disturbance regimes
(fire, herbivory) and vegetation histories also shaped their evolution and current and future function (bottom).

As their antiquity is increasingly recog- 
nized, the ecological processes that promote 
mesic savanna stability have come into in- 
creasing focus (2). Fire likely plays an im- 
portant role in stabilizing mesic savannas, 
excluding forests by preventing tree estab- 
lishment or killing trees, thereby favoring 
grasses. In total, grassy biomes make up 
>80% of the global burned area annually. 
Experiments, field observations, and remote 
sensing analyses all support fire as a mecha- 
nism allowing grassy ecosystems to expand 
into mesic regions. Plant traits are consistent 
with the history of fire in mesic savannas. 
The distinct, diverse, and ancient tree and 
shrub communities (8) are well adapted to 
enduring fires with thick bark, large below- 
ground nonstructural carbohydrate reserves, 
and bud banks that promote resprouting. In 
addition to tolerating fire, many grasses actively
spread fire (10). These fire adaptations
have major implications for the ecosystem 
functioning of grassy biomes. For instance, 
the large belowground reserves in grassy 
biomes may mean a substantially larger belowground
carbon storage compared with
that in other biomes (11). Current estimates
suggest that grassy biomes hold at least 17%
of global biomass carbon (12), but this is certainly
an underestimate (11) that needs to be
adequately quantified so that the potential 
role of grassy biomes as carbon sinks can be 
fully appreciated. 

Herbivores that graze on grass and eat 
tree leaves also influence grassland function 
(13), especially in semiarid savannas, where 
grass eaters decrease grass biomass accu- 
mulation and tree eaters prevent trees from 
establishing. Abundant herbivory-related 
traits have accumulated over evolutionary 
time in grassland plants, including herbivory 
defenses in trees (e.g., spines) (8) and grass 
morphologies that withstand intense grazing 
(e.g., growing from the base instead of from 
shoot tips and bud banks for resprouting) 
(1). Nevertheless, the importance for grassy 
biome distributions of herbivory relative to 


other factors, such as climate and soil condi- 
tions, remains an open question. 

Overall, evidence is converging around 
the idea that grassy ecosystems are com- 
plex, with ecologies that depend not just 
on climate but also on interactions and 
feedbacks with fire and herbivory. These 
ecologies are profoundly influenced by the 
evolutionary history and resulting trait 
diversity of regional biota (7). Their com- 
plexity makes predicting the responses of 
grassy biomes to global change a particu- 
lar challenge. Nonetheless, studies have 
shown that the combination of CO2 fertilization,
fire suppression, and livestock
extensification has resulted in widespread
woody encroachment (14) and associated
degradation of grassy biomes—a trend that 
will likely continue into the near future. 

Grassy biomes are also threatened by 
ongoing land use conversions and degrada- 
tion while being among the least protected 
globally (2). For example, 90% of temper- 
ate grasslands have been transformed into 
agricultural or urban areas, with <1% of 
remnants currently protected from land 
development. Whereas rainforests in the 
Amazon have attracted widespread atten- 
tion from the popular media, the ongoing 
threat to savannas, especially in Africa, 
South America, and Asia from afforestation, 
fire exclusion, and land use conversion, has 
gone unnoticed. The effects on savanna and 
grassland biodiversity will be devastating; 
for instance, 40% of grassland vertebrate 
species are projected to be lost by 2070 
(15). Thus, the fate of evolutionarily ancient 
grassy biomes hangs in the balance, with 
terminal consequences for their functionally
and evolutionarily distinct biota.

Ancient grasslands guide ambitious goals in grassland restoration

Grasslands, which constitute almost 40% of the terrestrial biosphere, provide habitat for a great
diversity of animals and plants and contribute to the livelihoods of more than 1 billion people worldwide. 
Whereas the destruction and degradation of grasslands can occur rapidly, recent work indicates that 
complete recovery of biodiversity and essential functions occurs slowly or not at all. Grassland 
restoration—interventions to speed or guide this recovery—has received less attention than restoration 
of forested ecosystems, often due to the prevailing assumption that grasslands are recently formed 
habitats that can reassemble quickly. Viewing grassland restoration as long-term assembly toward old- 
growth endpoints, with appreciation of feedbacks and threshold shifts, will be crucial for recognizing 
when and how restoration can guide recovery of this globally important ecosystem. 


Grasslands are essential components of
Earth’s system, supporting a biodiverse
array of plants, birds, insects, and other
animals and providing important ecosystem
services such as pasture forage,
water regulation and freshwater supply, erosion 
control, pollinator health, and carbon sequestration
(1, 2). Yet high rates of land cover conversion
for intensive agriculture and silviculture,
combined with woody encroachment and spe- 
cies invasion driven by altered fire and grazing 
regimes, threaten these systems (3, 4). For in- 
stance, the Cerrado has been extensively cleared 
for agriculture, with more than half lost in the 
past 50 years, exceeding the rate of forest loss in 
the Brazilian Amazon (5). The Great Plains of 
North America has also lost more than half its 
original grasslands and continues to lose 2% 
each year (6). 

As we enter the United Nations Decade on 
Ecosystem Restoration, much of the emphasis 
has been on the restoration of forests (7). Iron- 
ically, this emphasis presents an additional 
threat to grasslands: Careless or poorly planned 
tree-planting efforts in the name of restoration 
can establish forests in natural grassland and 
savannah ecosystems. For instance, almost 
1 million km² of Africa’s grassy biomes have
been targeted for tree planting by 2030 (8). 
This practice ignores the value of protecting 
and restoring grasslands. 

The conversion and degradation of grasslands 
can occur rapidly, yet restoring lost ecosystem 
services and diversity is often a discounted or 
underestimated challenge. Until recently, grassland
assembly was assumed to be a relatively
straightforward—albeit difficult—process (9): 
Allow herbaceous species to recolonize, at times 
augmenting with seed of native species; re- 
establish appropriate grazing and fire distur- 
bance regimes; and control ruderal, exotic, 
or woody species. Because many herbaceous 
species reach reproductive maturity in a few 
years, it was also assumed that this assembly 
process was relatively quick, achieving desired 
diversity and function within several years to a 
decade. We now know that this view of grassland 
restoration does not adequately acknowledge 
the difficulty of restoring biodiversity and func- 
tions or the time and interventions needed to 
restore grasslands (10). Here, we review recent 
developments that widen the view of grassland 
restoration to include grassland age and de- 
velopment, describe how this lens identifies 
important but overlooked restoration inter- 
ventions, and highlight several key unknowns 
for grassland restoration into the future. 


Refining the reference: The old-growth concept for grasslands


Grasslands occur in a range of biogeographical 
contexts (Fig. 1) including the tropical and sub- 
tropical savannas in Africa, Australia, Asia, and 
South America; the boreal, temperate, and 
southern prairies in North America; and the 
steppes in Eurasia. Grasslands have a contin- 
uous herbaceous layer of graminoids and her- 
baceous dicots, either without trees or, in the 
case of savannas, supporting a range of tree 
densities with a continuous grassy understory 
(3) (Fig. 2). The processes creating and main- 
taining grasslands vary across locations (11); 
these include edaphic or climatic conditions 
and disturbances (i.e., herbivore grazing or 
fire), all of which can limit the establishment 
of woody species (Fig. 3). 

The reference condition is a cornerstone con- 
cept in ecological restoration; it encapsulates a 
set of desired characteristics and provides guid- 
ance for how to evaluate project success, even 
if a restored system is rarely able to completely 
reach reference conditions (12). In grasslands
structured by edaphic or climatic conditions, 
with soils, low temperatures, or low precipita- 
tion constraining tree establishment, grassland 
is generally acknowledged to be the desired ref- 
erence state for restoration. In cases where cli- 
mate is suitable for forests but herbivore grazing 
or fire maintain them in an open state (10) (Fig.
3), more debate and uncertainty surrounds 
the reference designation. These disturbance- 
dependent grasslands are often assumed to be 
a result of deforestation (i.e., derived grasslands; 
grass-dominated vegetation resulting from 
human-caused deforestation) in an early succes- 
sional stage on a forest trajectory (Fig. 4). How- 
ever, climate suitability for tree growth does not 
preclude the likelihood that old-growth grass- 
lands exist (or used to exist) in the region (13). 


Fig. 1. The distribution of grasslands spans temperate and tropical regions of the globe. Green areas
estimate the extent of grassland distribution. We note, however, that all maps of grasslands should be
considered imprecise: Grasslands occur mixed within landscapes with other vegetation types and are often
disturbed to an extent that masks historic distributions. Letters in black are grasslands represented in Fig. 2;
letters in blue are grasslands represented in Fig. 3.

Moreover, these disturbance-dependent grasslands
are often at risk from factors driving
woody invasion, rearranging landscape mosaics
and shifting grass-forest boundaries (14). If afforestation
policies under the guise of restoration
disregard these dynamics, irreversible
damage will occur (7). 

In forest ecosystems, old-growth forests are 
often used as references for restoration. These 
are mature forests composed of large and old 
trees, large snags, and a diverse tree commu- 
nity with structural complexity, all of which 
require long time periods to develop. Recent 
work has made it abundantly clear that the “old 
growth” concept is not limited to forests (4, 12): 
Old-growth grasslands, also called ancient or 
pristine grasslands, assemble over centuries and
contain high species diversity, long-lived perennial
plants, and a substantial proportion of well-developed
belowground structure from which
species can resprout after natural disturbance. 
Old-growth grasslands are unique in their un- 
derground structures and biodiversity: They 
store carbon and reallocate resources above- 
ground after disturbances and drought. All 
biogeographic contexts where grasslands are 
present (Fig. 1) support ancient old-growth 
grasslands that have persisted for millennia. 
As with old-growth forests, there should be 
little expectation that restored grasslands will ever 
completely recover to resemble old-growth grass- 
lands. Even so, old-growth grasslands provide a 
suite of characteristics that can be the aim in 
restoration: long-lived perennial plants; a complex
diversity of belowground structures that enable
resprouting after aboveground disturbances
such as fire and grazing; and substantial below- 
ground carbon stores. Traditional management 
can usefully target these old-growth character- 
istics even in cultural landscapes where grass- 
lands are created and maintained by human 
activity, and regardless of historical analogs (15). 

With maps of grasslands contested and over- 
lapping those of forests (8, 13), it can be chal- 
lenging to determine whether a grassland is one 
that formed after the degradation of an old- 
growth grassland (i.e., a secondary grassland; 
grass-dominated vegetation resulting from the 
degradation of old-growth grasslands) or a de- 
rived grassland formed after deforestation. 
Paleoenvironmental methods, considering 


Fig. 2. The incredible diversity of old-growth grasslands. See Fig. 1 for 
locations. Whether these grasslands are maintained by disturbance (such as 
grazing or fire) or are environmentally constrained (EC, edaphic or climatic; 
see Fig. 3 for details) is indicated within brackets. (A) California coastal 
grasslands on Mount Tamalpais, USA (disturbance). (B) Curtis Tallgrass Prairie 
Restoration, Wisconsin, USA (disturbance). (C) Longleaf pine (Pinus palustris) 
savanna, North Carolina, USA (disturbance). (D) Grassland in the Espinhaco 
mountain range, Minas Gerais, Brazil (EC, edaphic + disturbance). (E) Subtropical 
grasslands in Rio Grande do Sul, southern Brazil (disturbance). (F) Alpine 
meadow in the Alps, Vanoise National Park, France (EC, climatic). (@) A high- 
ole National Park, Ghana (disturbance). (H) The 


rainfall grassy savanna in 
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Serengeti ecosystem in Tanzania (EC, edaphic + disturbance). (I) The grasslands 
in the Kavango Catchment, Angola (EC, edaphic and climatic + disturbance). 
(J) Grassland in the Drakensberg, South Africa (disturbance). (K) Grassland and 
tapia savannas on Ibity mountain, Madagascar (disturbance). (L) Petrophytic 
steppe in Khakassky Zapovednik State Nature Reserve, Russia (EC, climatic). 
(M) Eravikulam Shola grasslands, India (EC, climatic + disturbance). (N) Oak 
savanna in South Yunnan, Yuanjiang region, China (disturbance). (O) Mesic
savanna in the Northern Territory, Australia (disturbance). These grasslands vary 
widely in composition and structure yet share key characteristics that can guide 
restoration: high belowground allocation, complex resprouting structures, and 
unique functional and taxonomic diversity.

lengthy records of pollen, phytoliths, charcoal,
and Sporormiella fungi specific to herbivore 
guts, can provide evidence for past grasslands 
and their disturbance history (16). Species com- 
position and functional diversity (e.g., of below- 
ground structures), as well as phylogenetic 
studies dating the origins of endemic grass- 
land species, can also indicate antiquity and 
conservation value (17, 18). There are also con- 
texts where grasslands are the desired eco- 
system state for cultural or social reasons despite 
being created or maintained by humans. 


Pathways and thresholds of grassland degradation 


Grasslands are increasingly degraded by land- 
use change and altered disturbance regimes, 



which can fundamentally alter their structure 
and functioning (Fig. 4). Such degradation in- 
creases the need for grassland protection and 
restoration but can also decrease the capacity to
restore old-growth grassland characteristics.
Grazing and fire are dominant aboveground 
disturbances that have coevolved with grass- 
land plants, maintaining diversity and function 
in grasslands (4). Changes to these disturbance 
regimes can gradually alter grasslands. Although 
this results in the loss of biodiversity and sim- 
plification in composition, structure, and func- 
tioning, altered grassland often maintains some 
belowground structures (Fig. 4). Lack of grazers 
(or of particular suites of grazing species) can 
homogenize grasslands and increase fire occur- 


Fig. 3. Interactions among climate, soils, disturbance, and vegetation are key considerations for under- 
standing old-growth grasslands as well as recovery trajectories in secondary grasslands. (A) On most soil 
types, the existence of disturbance-dependent grasslands (in light rose-color) is determined by interactions 


between soils and endogenous disturbances (fire, herbivory). Tree recruitment is limited by these disturbances. In 
environmentally constrained grasslands (in light brown), poor drainage (seasonally saturated or inundated soils),
extremely low moisture-holding capacity (shallow, rocky soils), exceptionally low soil fertility, cold temperature, or
low precipitation preclude dense tree cover, even in the absence of frequent disturbances. Disturbances and abiotic
factors (circles, in no set order) that could result in exclusion of trees are placed as examples in each of the far
left zones, respectively. In forests (dark green), dense tree cover constrains fire frequency and grazer abundance by
limiting herbaceous plant productivity. The light green state space between disturbance-dependent old-growth
grasslands and forests represents unstable vegetation (fire-excluded, tree-encroached grassland) in transition
between alternative ecosystem states; old-growth grasslands and forests often co-occur in mosaics in such landscapes.
(B to D) Examples of grasslands structured by different interactions. (B) Bison grazing in Konza Prairie,
where fire is needed to suppress woody encroachment. (C) Water saturation of the soil prevents tree establishment
and fire maintains diversity in this wet grassland in Jalapao, Northern Brazil. (D) Sheep grazing in a Mediterranean 
grassland in Southern France, where pastoralism has coevolved with the system in a grassy state since the Holocene. 


596 5 AUGUST 2022 + VOL 377 ISSUE 6606 


rence. On the other hand, overgrazing, particu- 
larly in grasslands with no evolutionary history 
of grazing, can result in loss of basal cover, soil 
compaction, and increased erosion (9). Defin- 
ing the degradation point in these circumstances 
is difficult; for instance, naturally occurring 
“grazing lawns” have many of the biophysical 
characteristics associated with degradation 
(low aboveground biomass, soil compaction,
sometimes even increased bare ground) even 
though their unique biodiversity and ecologi- 
cal importance is now increasingly recognized. 
Fire regimes can also become too frequent or 
infrequent or occur during the wrong season. 
The longer these altered disturbance regimes
persist, the greater the risk to belowground structures
(e.g., bud banks) that speed recovery. Altered
disturbance regimes can also facilitate exotic 
grass invasion and woody encroachment (20), 
which can compound impacts to belowground 
structure over time. 

The most detrimental disturbances are those 
that rapidly destroy belowground structure, 
such as tillage agriculture, mining, and affor- 
estation (10, 21). For instance, 50 years of pine 
plantation completely eliminated the viable 
bud bank in a once-open savannah (22). Several 
decades after cultivation or mining, the compo- 
sition of secondary grassland plant commun- 
ities remains very different from that of nearby 
old-growth grasslands, lacking species with poor 
dispersal abilities and species regenerating from 
belowground organs (10, 23). Belowground 
degradation can therefore cause grasslands to 
cross a hard-to-reverse threshold where resto- 
ration may be difficult or impossible within 
decades of these disturbances. Given the ap- 
parent existence of this threshold, it is vital that 
remaining old-growth grasslands are protected, 
particularly from the threats that affect below- 
ground processes and structure, as we cannot 
rely on restoration to guide complete recovery 
after such degradation. 


Interventions toward old-growth characteristics 


In contrast to the early successional view of de- 
rived grasslands as a stage on their way to forests, 
restoring old-growth characteristics to altered or 
secondary grasslands requires attention to the 
development of a complex belowground struc- 
ture akin to the aboveground complexity in an 
old-growth forest (24). A synthesis of 31 studies, 
including 92 time points on six continents, in- 
dicates that secondary grasslands may typically 
require at least a century, and more often mil- 
lennia, to recover their former species richness 
(23). Even as their richness increases over dec- 
ades to centuries, these grasslands still lack 
many characteristic old-growth grassland spe- 
cies and instead support more short-lived, early 
successional species than their old-growth 
counterparts. We know less about the timeline 
for belowground soil and structure development,
but it likely corresponds with the timeline
of these compositional dynamics (25). The increased
appreciation of the temporal dimension
of grassland assembly emphasizes the need for
restoration to accelerate this trajectory and chal- 
lenges the view that one initial period of active 
restoration will be sufficient to guide develop- 
ment. We highlight three advances driven by 
this increased appreciation below. 


Focus interventions on 
disturbance-vegetation feedbacks 


In cases where degradation has not had a cat- 
astrophic impact on belowground structure, it 
may be possible to reestablish broken feedbacks 
that then can guide recovery (26). Feedbacks 
among disturbance, vegetation, and below- 
ground soil development have structured grass- 
lands for millennia (4, 27). Disturbance regimes 
select for functional traits of the vegetation, 
which then provide feedback to affect the in- 
tensity, frequency, and impact of disturbances 
(28). For instance, fire regimes vary in flamma- 
bility depending on plant properties, and herbi- 
vore pressure varies depending on the quantity 
and quality of forage and habitat suitability 
for predator avoidance (27). The response of 
vegetation to these disturbances varies based 
on plant traits such as resprout ability, clonal 
growth, and seed recruitment (26, 28). Feed- 
backs also extend to soils and soil organisms, 
as soils determine plant growth but are also 
products of the plants that grow on them (29). 

As feedbacks in degraded grasslands differ 
in their nature and strength from those with 
more old-growth characteristics, reestablishing 
a disturbance regime in degraded grasslands 
may not result in expected effects of the distur- 
bance or in the intended vegetation responses to 
the disturbance. Interventions simultaneously 
addressing disturbance and biota may be the 
best option to break the feedbacks that constrain 
recovery. For instance, there are examples of 
creative use of prescribed fire as a tool to re- 
create grazing habitat (30), and livestock can be 
managed in such a way as to initiate grazing 
habitat that supports large mammalian herbivores
(31). Amendments such as biochar and
mycorrhizal inoculum can shift the soil envi- 
ronment to be more suitable for native species, 
characteristics which can be maintained by slow 
growth and resource cycling of the vegetation 
(32, 33). As the system recovers, these interventions
also need to shift depending on how the
recovering biota affects disturbance dynamics 
and vice versa. 


Breaking the cycle of invasion: Vegetation 
change that constrains recovery 


Restoration in areas where an altered distur- 
bance regime has resulted in woody encroach- 
ment or exotic herbaceous species invasion 
demonstrates the importance of viewing restoration
as a set of interventions that iteratively
move the system to a new system state (10, 34).

Fig. 4. Degradation pathways can result in differential loss of ecosystem function and diversity to old-growth
grasslands, and the recovery of “old-growth” characteristics is dependent on the degree of functional
change. Axes of functional and compositional change depict divergence from the reference characteristics
[modified from (23)]. (A) The trajectory of recovery in restored grasslands (blue spheres) toward old-growth 
characteristics (lower right) is dependent on the degradation pathways (red arrows, ranging right to left from altered 
disturbance regimes to land use conversion) as well as vegetation-soil-disturbance feedbacks (black arrows) at 
each stage of recovery. Substantial belowground disturbance (e.g., tilling) may cause the system to cross a hard-to- 
reverse threshold (gray line) and woody encroachment shifts feedbacks and can lead to alternative trajectories. 
Iterative restoration interventions (dashed black arrows) that consider these feedbacks can result in progression 
back toward old-growth characteristics. (B) Forests show similar dynamics, where recovery to old-growth 
characteristics after deforestation may be hard if not impossible. An early recovery stage after deforestation may be 
a grassy stage (which we term a derived grassland), yet the recovery trajectory is toward forest. Restoration 


interventions may accelerate recovery. 


Woody species can strongly influence distur- 
bance regimes, and land managers have re- 
sorted to cutting, herbicides, and even plowing 
to remove trees—with striking consequences for 
the remaining biodiversity. Extreme fires (fire- 
storms) have been applied in heavily encroached 
areas using spiral ignitions or extreme weather 
days to try to reverse the woody cover and re- 
initiate ecologically relevant feedbacks (35). 
Once the grassy understory has been reduced to 
the point that it cannot carry a fire or support 
grazers, woody encroachment becomes more 
difficult to reverse (36), requiring the replanting 
of herbaceous vegetation alongside the initiation 
of a disturbance regime for recovery feedbacks.
When invasive species are grasses, they can 
often maintain disturbance regimes that benefit 
short-lived ruderal life histories, preventing tran- 
sitions to the belowground complexity and al- 
location that characterize old-growth grasslands 
(37). High accumulation of litter and standing 
dead biomass changes local fire behavior, and a 
dependence on seed recruitment often confers 
advantage for invasives under this disturbance 
regime (38). Dominance in the seed bank and 
difficulty reestablishing long-lived natives can 
make this feedback particularly difficult to address.
One strategy is to enhance the ability of
natives to recruit by seed via seed enhancement 
technology (e.g., seed coating or pelleting aimed 
at mitigating the conditions that limit estab- 
lishment) (20), potentially addressing priority 
effects (i.e., the order in which plants are re- 
introduced) that influence species dominance in 
early stages of restoration (39). 


Overlooked old-growth grassland species 


One important restoration question is how to 
accelerate or facilitate species turnover toward 
old-growth species composition and associated
belowground function. Worldwide, grasslands
are often restored by sowing seeds (40).
However, as many species have developed colo- 
nization and survival strategies that are based 
on belowground buds and clonal growth (23, 41) 
rather than on seeds, additional techniques may 
be needed to restore old-growth characteristics. 
Seeding fast-growing species can impede long- 
term restoration success by creating commun- 
ities with low resilience to natural disturbance, 
such as fire, and excluding the longer-lived spe- 
cies from restoration (42). In fact, there may be 
many grasslands where seeded species main- 
tain dominance long after restoration, spurring
reconsideration of whether actions are achieving
the desired old-growth structure (43).
Although bud-bearing belowground organs 
can persist in the absence of disturbance for 
some time in a degraded grassland (44), how 
long is still unclear. Once these belowground 
structures are gone, we have little understanding 
of how to reintroduce this component of the 
vegetation (24). Topsoil transfer has shown some 
success in broadening the type of species that 
restoration can reintroduce (45), yet even this 
technique favors species with high seed bank 
allocation. Vegetative propagation—such as 
micropropagation, transplantation of seedlings, 
and individual tillers—is often needed (24) but is 
hard to conduct at scale, with open questions 
about protocols, spatial configuration of planting, 
and genetic sourcing. Techniques aimed at speed- 
ing the establishment of bud banks and below- 
ground organs in a restoration have shown 
promise but are just in their infancy (24, 41). 


Global change as a challenge and opportunity 


Global climate change frames the emerging per- 
spective of long-term assembly toward old- 
growth characteristics in grassland restoration. 
Climate controls the distribution of grasslands 
in some regions, influences the feedbacks and 
threshold shifts that determine where grasslands
persist, and, in virtually all regions, can
have a strong influence on the interventions 
needed to restore feedbacks (14, 46). Depend- 
ing on the degree to which climate influences 
these processes, it may also affect the historical 
approach to the determination of grassland 
types and disturbance regimes (12). For instance,
changes such as elevated atmospheric
CO2, which exacerbates invasion of woody species,
would require novel disturbance regimes to
aim for a grassy state. 

Given the strong feedbacks between compo- 
sition and disturbances in grassland recovery, 
shifts in climate may exert large influences on 
the assembly process. In some cases, it may be 
important to let climate effects shift restoration 
trajectories, as climate can guide species com- 
position or characteristics to those most able to 
tolerate future conditions (47). Restoration ef- 
forts under a climate change scenario may thus 
target not only which species should be present 
at a given site, but also functional diversity, soil 
structure, and the belowground component. In 
this way, the system may be able to recover from 
an extreme event, as the presence of a viable bud 
bank and underground storage organs ensures 
the resilience of the system (48). However, letting 
climate effects shift restoration trajectories might 
also be undesirable if it endangers fundamental 
feedbacks in the trajectory of the system toward 
old-growth functional characteristics (46) by, for 
instance, selecting for species with greater above- 
ground allocation characteristics. As below- 
ground complexity is a characteristic that develops 
over long time horizons, understanding how
climate influences priority effects and feedbacks
that affect recovery trajectories is critical. 
Climate change will add difficulty to the al- 
ready difficult challenge of restoring old-growth 
grasslands that resemble specific reference sites, 
as these ancient grassland references developed 
in a different time, disturbance regime, and cli- 
mate. Yet we expect that restoring old-growth 
characteristics in these situations, prioritizing 
processes such as belowground complexity and 
functional diversity (49), should enable resilience 
and facilitate adaptation to future change while 
still maintaining character, functions, and services 
that embody these globally important systems. 


Outlook 


As we enter the United Nations Decade on Eco- 
system Restoration, advances in restoration sci- 
ence and practice in grasslands are critical if we 
are to combat the loss of old-growth grasslands 
and the decline of biodiversity (50). However, in 
the rush to provide nature-based solutions to 
tackle climate change, tree planting in grasslands 
has become synonymous with restoration in 
many regions (13). At the same time, the high 
demand for arable land continues to spur con- 
version to agriculture. These are irreversible ac- 
tions, ignoring the belowground soil-locked 
carbon storage in these old-growth grasslands 
as well as the hard road to restore their below- 
ground complexity and their biodiversity once 
they are lost. 

Although there are many challenges ahead, 
viewing grassland restoration as assembly 
toward old-growth characteristics with unique 
biota and belowground complexity will enable 
us to achieve ambitious restoration goals for 
Earth’s grassy ecosystems. Given that grassland 
recovery involves strong feedbacks among veg- 
etation, disturbance, and soils, as well as the 
lengthy time horizon for recovery, future prog- 
ress depends on creative interventions that focus 
on iterative management, taking into account 
changes in grassland assembly over time. Tech- 
niques to reestablish species characteristic of 
old-growth grasslands, given their belowground 
structure and limited recruitment by seed, will 
require looking beyond or augmenting traditional 
seeding techniques. Metrics of belowground com- 
plexity and functional diversity will be critical 
guideposts to track trajectories in development 
and assess success. We urge conservation initia- 
tives to safeguard against the conversion of old- 
growth grasslands for tree planting or tillage 
agriculture, to maintain our ancient biodiverse 
grasslands with appropriate disturbance regimes, 
and to emphasize the long-term restoration of 
grasslands in efforts to restore Earth’s biodiversity.

REVIEW


Molecular, cellular, and developmental foundations of grass diversity


Paula McSteen and Elizabeth A. Kellogg


Humans have cultivated grasses for food, feed, beverages, and construction materials for millennia. 
Grasses also dominate the landscape in vast parts of the world, where they have adapted 
morphologically and physiologically, diversifying to form ~12,000 species. Sequences of hundreds of 
grass genomes show that they are essentially collinear; nonetheless, not all species have the same 
complement of genes. Here, we focus on the molecular, cellular, and developmental bases of grain yield 
and dispersal—traits that are essential for domestication. Distinct genes, networks, and pathways were 
selected in different crop species, reflecting underlying genomic diversity. With increasing genomic 
resources becoming available in nondomesticated species, we anticipate advances in coming years that 
illuminate the ecological and economic success of the grasses. 


Most people reading this review will have
either eaten, stepped on, or burned a 
grass within the past 24 hours. Hu- 
mans have been cultivating grasses 
for at least 10,000 years and likely 
consumed them for millennia before that. Of 
the crops that feed the world, the big three— 
wheat, maize, and rice (Fig. 1)—provide 50% 
of calories consumed by humans as well as 
protein and micronutrients, are grown over 
the widest area, and have the highest economic
value (1). In addition, so-called “orphan
crops,” such as tef, sorghum, fonio, and various 
millets, most of which are native to Africa, 
grow well with less intense agricultural inputs 
and are poised to be cultivated more widely to 
serve a warmer, drier planet. Meat, eggs, and 
dairy products are the products of animals that 
consume forage, pasture, and prairie grasses. 
Moreover, some of the most devastating 
agricultural weeds, such as Johnson grass in 
corn fields and barnyard millet in rice fields, 
are grasses (2). Grasses also underpin the bev- 
erage industry; the world wouldn’t have beer 
without barley (Fig. 1) or rum without sugarcane, 
with the latter being used to produce not 
only sugar but also biofuel (1). Turf grasses
beautify cultivated landscapes and provide 
the playing surface for golf courses, tennis 
courts, cricket pitches, and other sports fields. 
Grasses such as Miscanthus and switchgrass 
are being developed for lignocellulosic biomass, 
and perennial grasses, such as intermediate 
wheatgrass, may help store carbon below 
ground. Bamboos (and even giant reeds) are 
used for construction. Yet despite this diverse 
repertoire, only a small subset of the ~12,000 
species of grasses are used by humans (2, 3).

Like orchids, lilies, asparagus, and pineapples,
the grasses (family Poaceae or Gramineae) have 
a single seedling leaf (cotyledon) and are placed 
in the large clade of monocotyledonous flower- 
ing plants (monocots). The grasses constitute 
~20% of the ~60,000 species of monocots (4). 
Thus, all grasses are monocots, but most mono- 
cots are not grasses. Grasses that produce seed 
that is cultivated agronomically and eaten by 
humans and animals are often called cereals. 


Morphological and physiological diversity 


Grasses are ecologically dominant in vast areas 
of all the continents except for Antarctica (2, 5). 
Even in areas with some tree cover, grasses 
form a broad understory. The grass family may 
have originated more than 80 million years 
ago, extending its continental reach during the 
late Miocene grassland expansion (8 million to 
3 million years ago), although its current dis- 
tribution also reflects extensive climatological 
change since then (5, 6). 

Broad physiological adaptations permit grasses 
to thrive in disparate environments. Most grasses
are tropical, but one major group, subfamily
Pooideae, has spread widely in cool and cold 
areas, even reaching Antarctica (7). Whereas 
some genetic components of their cold tol- 
erance are widely shared stress responses, 
others represent the repurposing of loci in- 
volved in other physiological responses (8). 
Such loci include ones that regulate the in- 
duction of flowering after cold (vernalization), 
as in winter wheat (9). Among the tropical 
grasses, high-efficiency (C4) photosynthesis 
has originated 22 to 24 times (10), with the 
physiological and anatomical bases of the path- 
way being subtly different each time. Our cul- 
tivated cereals are mainly annual, grown for 
their ability to complete their life cycle (seed to 
seed) in one growing season, but most species 
of grasses are perennial (2). The genetic mech- 
anisms underlying the shift from perennial to 
annual are unknown but are likely diverse (11).


Genomic diversity 


The genomes of grasses are largely collinear 
for all species in the family, that is, the genes 
are in roughly the same order (12). This broad
similarity allows genes identified in one spe- 
cies to be discovered in a second species, 
permitting the grasses to function as “a single 
genetic system” (12). All grass genomes also 
share large regions of duplicated genes, which 
points to a polyploidization event in the com- 
mon ancestor of the family [e.g., (13, 14)]. 
Polyploidization events have continued to oc- 
cur frequently throughout the evolution of the 
family, with some authors estimating that as 
many as 75 to 80% of the species are recent 
polyploids (15). 

Beneath this broadly conserved genome ar- 
chitecture lurks extensive diversity, including 
variation in nucleotides (single-nucleotide poly- 
morphisms), gene structure, and even the pres- 
ence or absence of genes [e.g., (16, 17)]. The 
nucleotide differences between two lines of 


Fig. 1. Diversity of grass inflorescence morphology. (A) In wheat, the unbranched spike produces 
single spikelets (inset) with multiple florets. (B) In barley, the unbranched spike produces triplet spikelets 
(inset). In this two-row variety, only the central spikelet produces a floret. (C) Rice has many branches
and produces single spikelets (inset) with a single floret. (D) Maize produces many branches with paired 


spikelets (inset) that each produce two florets.

Zea mays (maize) are greater than those between
humans and chimpanzees (18). Genes
central for plant structure in maize are missing 
in wheat and rice, and vice versa (19). In other 
words, not all grasses have the same comple- 
ment of genes, and their morphology is altered 
accordingly. 


The grain: A grass-specific structure 


The grass fruit (grain or caryopsis) is the in- 
novation that characterizes all grasses (2) 
(Fig. 2A). The grain develops from fusion of 
the single seed to the inner wall of the ovary, 
creating a single solid structure. The wheat 
“seeds” sold in the grocery store are in fact 
grains, with the bran made up of the ovary 
wall plus the seed coat. Within the grain is 
the young embryo (the germ, in wheat), which 
is a well-formed little plant with multiple leaf 
primordia and shoot and root apical meristems
(the stem cells that give rise to all organs
in the plant) (Fig. 2B). Development of the 
grass embryo progresses a long way before 
the fruit is shed from the plant, distinct from 
that in other closely related monocot families in 
which the embryo is a globular, scarcely differ- 
entiated mass of cells at fruit maturation (2). 

The grain and the inflorescence that bears 
it have been the focus of both natural and 
human selection for grain size and number 
and dispersal. The starch-filled endosperm and 
oil-filled embryo of grains made wild grasses 
an obvious source of food for human ances- 
tors. The early process of converting these wild 
species into ones that could be cultivated 
year after year is well known and is described 
in many biology textbooks. Traits in this fam- 
iliar “domestication syndrome” may include 
(i) cultivated plants with grains that are 
larger than those of their wild ancestors 
and do not drop off the plant, (ii) lack of 
dormancy, (iii) loss of awns (wheat, sorghum, 
oat, rice), and (iv) increased grain number. 
We will focus on the developmental, cellular, 
and molecular bases of two of these traits: 
failure of seed drop (called “loss of shatter- 
ing”) and grain number. 


Shattering: Useful in the wild, a liability 
in cultivation 


An early step in grass domestication is selec- 
tion for mutations that let the plant hold onto 
its seeds rather than drop them in the dirt. The 
annual cycle of reaping and planting automat- 
ically selected for grains that were held more 
firmly than those in wild undomesticated plants 
and, over time [possibly ~1000 years (20)], led 
to domesticated plants in which the flower 
stalks fail to break easily, so-called nonshatter- 
ing varieties. Lack of shattering was selected 
independently in most known domestication 
events in cereals (21).

The close relationship and genomic similar- 
ities among the cereal crops suggested that

Fig. 2. Grains characterize all grasses.

(A) Photograph of a maize grain indicating the 
starch-filled endosperm and the seed coat. The 
scutellum (part of the first leaf or cotyledon) and 
the embryo are visible on the adaxial side of the 
kernel. (B) Diagram of a longitudinal section through 
the well-developed maize embryo indicating the 
coleoptile and scutellum (which make up the first 
leaf or cotyledon), multiple leaf primordia, the 
shoot apical meristem (SAM), and the root apical 
meristem (RAM). 


perhaps loss of shattering in wheat, sorghum, 
rice, and others could have occurred by re- 
peated modifications of the same underlying 
genes. However, a series of quantitative ge- 
netic locus studies (22) and subsequent studies 
that looked at the expression of genes involved 
in forming the break point itself (abscission 
zone) have found extensive differences among 
the crops (23). Genes that are mutated in do- 
mesticated wheat (brittle rachis 1 and 2) are 
unrelated to those in rice (shattering4 and 5), 
which in turn are distinct from those in 
sorghum and millet [e.g., less shattering1, 
which is reviewed in (24)]. The one exception 
may be a locus known as shattering1 (sh1) in
sorghum (25), which is also mutated in domesticated
rice and foxtail millet (22, 26). sh1
is a transcription factor in the YABBY family 


(named for the distinctive DNA binding do- 
main known as a yabby domain), but its pre- 
cise molecular function remains unknown. 

Spontaneous reversal of domestication, in 
which shattering has been reacquired inde- 
pendently, has created grasses that grow as 
weeds within the crop; such dedomestication 
has been documented in at least four lineages 
of rice, as well as in a few other grasses [reviewed
in (27)]. The underlying domestication
mutations are still present in the newly
weedy rice, but the weedy populations have 
additional mutations that lead to shattering, 
each using different sets of genes (28). 

Even among wild grasses, shattering appears 
to occur by different mechanisms, which may 
explain the distinct sets of mutations in the 
different domestication events. The break point 
forms in different positions in different lineages 
of grasses (2). Breakage occurs below the flower 
(often called a floret) in many species, such that 
the grain is shed along with floral organs and 
subtending bracts, but in other grasses (includ- 
ing the many species of millet), breakage occurs 
below the clusters of flowers (called spikelets) 
so that several flowers fall off the plant at once 
(Fig. 3). In still other species such as wheat and 
barley relatives, the inflorescence stalk breaks 
up. Cellular details and cell wall structures also 
differ among species, but the cell wall differ- 
ences do not correlate with the location of the 
abscission zone or with evolutionary relation- 
ships (29). 

Specific sets of genes characterize the abscis- 
sion zones of rice, Brachypodium, and green 
millet, but the abscission-specific genes are 
almost completely nonoverlapping (23). Only 
two, a MYB transcription factor and a lysine 
decarboxylase, are specific to the abscission 
zone of all species (23). sh1 is commonly
upregulated in the abscission zone but is also
expressed more widely, suggesting that its 
function in the abscission zone is part of a 
larger spikelet developmental network. 

Despite years of investigation, the precise 
process of shattering in grasses remains un- 
known. Most of the genes that affect the 
process are transcription factors, often from 
well-known gene families that affect other 
aspects of plant development. One compelling 
hypothesis is that the process of shattering is 
not a single mechanism but rather a set of 
mechanisms that have evolved over time. 


Grain yield: A diversity of mechanisms 


Because a grass flower (floret) produces only 
one grain (at most), the number and arrange- 
ment of flowers directly affect the yield. The 
number of grains is thus affected by the num- 
ber of flowers per spikelet, the number of 
spikelets per branch, and the number of 
branches per inflorescence, all of which vary 
among species (Fig. 4). Furthermore, many spe- 
cies have inflorescences that top vegetative 




branches, called tillers, further contributing to 
grain yield. Complicating the picture, grain weight 
and number of grains are generally inversely 
correlated, so simple selection for more grains 
leads to more smaller grains (30). Because of 
the complexity of how flowers are produced, 
increased grain number can be achieved by 
any number of different mechanisms. 

Domestication and postdomestication breed- 
ing of cereals have led to an increase in the 
number of grains produced (increased yield) 
compared with that produced by the wild an- 
cestor. For example, hybrid maize bears 16 to 
22 rows of kernels around the circumference 
of the cob, substantially more than the wild 
ancestor teosinte, which bears only two rows. 
The number of rows is always an even num- 
ber because maize produces its spikelets in 
pairs (Fig. 4), as do all other members of the 
tribe Andropogoneae, including sorghum, 
sugarcane, and Miscanthus (2). Paired spike- 
lets have also arisen independently in the 
related tribes Paspaleae (e.g., seashore paspalum) 
and Paniceae (e.g., fonio, crab grass). But how 
do these grasses produce pairs in the first 
place? The vast majority of grasses, like rice, 
produce spikelets singly, although another 
cereal, barley, produces spikelets in triplets. 
Wheat produces spikelets singly, but muta- 
tions can cause the formation of paired or 
triple spikelets, indicating that wheat has the 
underlying genetic capacity to produce ad- 
ditional grain. Could understanding these 
mechanisms be used to increase grain num- 
ber in cereals or in grasses or orphan crops to 
be domesticated in the future? 

Multiple mechanisms have led to the variation in inflorescence
morphology observed in grasses during evolution and domestication
(Figs. 1 and 4). Determination of the molecular, cellular, and
developmental bases of these phenotypes indicates that similar
phenotypes in one species can be caused by different pathways or that
orthologous genes can cause different phenotypes in different species.
In the following sections, we discuss three mechanisms involved
in the diversity of morphology in cereal grasses.


To branch or not to branch


Multiple genetic pathways control branching in grass inflorescences
(19). Mutations in these pathways can lead to increased branching and
increased grain number, so it is not surprising that these pathways have
been selected in the evolution, domestication, and breeding of cultivated
cereals. However, a variety of different pathways have been used
in different cereals (19). We discuss just two of these pathways below.

In maize, expression of ramosa2 (ra2), which encodes a transcription
factor in the lateral organ boundary (LOB) domain family, acts
upstream of ramosa1 (ra1), which encodes a zinc-finger transcription
factor and controls the abrupt switch from producing branches
to producing spikelet pairs [reviewed in (19)]. The expression pattern
of ra1 in Miscanthus and sorghum, both of which also produce
spikelets in pairs and are in the same clade as maize, also correlates
with the branch-to-spikelet pair transition, albeit later, correlating
with an increased number of branches (31). However, ra1 is not found
in rice, barley, wheat, or other members of their subfamilies
that do not produce spikelets in pairs (19). Conversely, mutations in
the ortholog of ra2 in barley also increase branching and are associated
with phenotypic differences between two-row and six-row barley
(Fig. 4) (9). Therefore, the genetic network regulated by ra2
differs between major groups of grasses even though the protein itself
is conserved.

Recent progress has been made in understanding the genetic basis for
the unbranched spike morphology in wheat and barley by the
compositum1 (com1) and com2 loci [reviewed in (32)]. Whereas com1
orthologs do not regulate branching in maize and rice (33), the
function of com2 appears to be somewhat conserved. com2 mutations
increase branching and spikelet number in barley and cause the
production of paired or, rarely, triple spikelets in “miracle wheat,”
which is so called because of its increased grain yield (32, 34).
Mutations in the orthologous gene also increase branching in maize and
rice, but the additional spikelets do not produce florets and are
sterile, and hence do not increase yield (35). However, mutations in the
promoter of the rice ortholog, which cause reduced rather than complete
loss of function, increase spikelet number and yield and thus may be
valuable for breeding (36). Evolutionary analysis of com2 orthologs
identified signatures of selection at particular amino acids in rice,
wheat, and barley (37), although their functional importance remains
to be determined.

Fig. 3. Diagrams of spikelets (rice, millet, tef) and inflorescence (wheat,
barley) showing different positions of break points (abscission zones).
Modified leaves known as glumes (dark green arcs) mark the base of the
spikelet and provide critical positional landmarks for comparisons. The
break-point position above the glumes, as in rice, is common and ancestral in the
grass family (23). The position below the glumes, as in millet, predominates in
the subfamily Panicoideae. Few grasses break right below the grain, as in tef.
The breakable inflorescence stalk is common not only in wild relatives of
wheat and barley but also in maize.


Growth suppression 


Another mechanism for altering branching 
would be to suppress the outgrowth of struc- 
tures that have already been formed. For 
example, increased expression of several transcription
factors, including teosinte branched1
(tb1) and grassy tillers1 (gt1), has led to the
suppression of tiller buds during domestication
in maize, and these transcription factors
are proposed to have conserved roles in regulating
tiller number in wheat and rice [reviewed
in (35)]. tb1 and gt1 have been used
repeatedly in cereals for different purposes
other than tiller number. For example, gt1
was co-opted in sex determination in maize
(38), and orthologs of tb1 or gt1 are used in
the suppression of spikelets in two-row barley
(35). Furthermore, in wheat, loss of function
of tb1 and interactions with flowering-time
genes cause production of paired 
spikelets (39, 40). Thus, changes in 
expression (or the targets) of tran- 
scription factors that cause growth 
suppression could be very powerful 
in causing phenotypic changes. 


Meristem size matters 


One mechanism to increase grain 
number in maize and rice is to 
increase the size of the apical inflorescence
meristem (41). A conserved
signaling pathway involving
proteins in the CLAVATA (CLV) 
and WUSCHEL (WUS) families reg- 
ulates the plant growth hormone, 
cytokinin, which affects the size 
and number of stem cells in the 
meristem. Mutations that affect 
signaling in the CLV-WUS pathway 
can increase meristem size, row 
number, and yield in maize and 
green millet (42, 43) but increase 
floral organ number in rice (44). 
Despite these differences, a screen 
for alleles with signatures of selec- 
tion in both maize and rice identi- 
fied the same locus, which increases 
yield in both species through an
increase in cytokinin and cell division (45). Mutations
that increase cytokinin levels or signaling
also increase the number of branches and
yield in rice (46), and cytokinin has been implicated
in branching in barley (47). However, the
CLV-WUS pathway has not yet been functionally
characterized in wheat and barley, although
it is an obvious target for crop improvement.

Fig. 4. Mutations that influence spikelet number in cereals provide insights into evolutionary mechanisms. (A) In wheat, the unbranched spike produces spikelets
with a variable number of florets (average of three). (B) Mutations in wheat can cause the production of paired spikelets, similar to maize. (C) In two-row barley, only the central
spikelet produces a floret. (D) In six-row barley, all three spikelets produce a floret and set seed. (E) Rice has many branches and produces single spikelets with a single
floret. Mutations in rice that increase yield increase branch number and reiterate branches on the branches. (F) Maize produces many branches with paired spikelets, each of
which produces two florets. Mutations in maize can cause the production of single spikelets, similar to rice and wheat, or can convert spikelet pairs to branches.

Meristems that produce multiple spikelets 
are larger than single-spikelet meristems. Such 
meristems include the spikelet-pair meristem 
in maize, the mutant paired-spikelet or triple- 
spikelet meristems in wheat, and the triple- 
spikelet meristem in barley; the latter extends 
over almost half the circumference of the in- 
florescence (48). In maize, defects in the CLV- 
WUS pathway or the plant growth hormone 
auxin can cause the production of single in- 
stead of paired spikelets (49, 50). It seems 
likely that similar pathways are involved in the 
production of the triple-spikelet meristem in 
barley and in the independent origins of the 
paired spikelets in grasses. However, multiple 
ligands, receptors, and transcription factors, 
and even parallel pathways, converge on the 
CLV-WUS pathway in different meristem types, 
so the pathways that specify each meristem type 
in each crop will need to be identified. 


Outlook 


Grasses are an economic and ecological suc- 
cess story. We speculate that the large endo- 
sperm and well-developed embryo that are 
characteristic of grasses (Fig. 2) gave grains a 
head start in germination and seedling survival, 
in both ecological and agricultural settings. 
Grass genomic diversity provides the raw ma- 
terial for their morphological diversity. Ge- 
nomic sequencing has provided insights into 
the genetic basis of domestication and post- 
domestication breeding of cereal genomes 




[reviewed in (51)], the development of wood
in bamboo (52), and the multiple independent
origins of cold tolerance, photoperiod insensitivity,
and C4 photosynthesis (7, 70). The
availability of functional genomics tools (53) 
will provide opportunities to move from genes 
to networks and to determine which parts of 
the pathway are conserved and which are 
species specific. These networks will enable 
modern-day agriculturists to determine how 
to domesticate orphan crops such as tef and 
fonio and to begin to understand how grasses 
have covered the world. 

Grassland soil carbon sequestration: Current
understanding, challenges, and solutions 


Yongfei Bai* and M. Francesca Cotrufo*


Grasslands store approximately one third of the global terrestrial carbon stocks and can act as an 
important soil carbon sink. Recent studies show that plant diversity increases soil organic carbon (SOC) 
storage by elevating carbon inputs to belowground biomass and promoting microbial necromass 
contribution to SOC storage. Climate change affects grassland SOC storage by modifying the processes 
of plant carbon inputs and microbial catabolism and anabolism. Improved grazing management and 
biodiversity restoration can provide low-cost and/or high-carbon-gain options for natural climate 
solutions in global grasslands. The achievable SOC sequestration potential in global grasslands is
2.3 to 7.3 billion tons of carbon dioxide equivalents per year (CO₂e year⁻¹) for biodiversity restoration,
148 to 699 megatons of CO₂e year⁻¹ for improved grazing management, and 147 megatons of
CO₂e year⁻¹ for sown legumes in pasturelands.


Grassland ecosystems cover an area of
52.5 million km², accounting for ~40.5%
of the Earth’s land surface excluding
Greenland and Antarctica (1). Grasslands
provide habitats for biodiversity, contribute
to food production, and deliver many
cultural services (1). They also store ~34% of
the terrestrial carbon stock (1), with ~90% of
their carbon stored belowground as root bio- 
mass and soil organic carbon (SOC), thus 
playing a vital role in soil carbon sequestration 
(1, 2). However, grasslands are highly vulner- 
able to human disturbance (e.g., overgrazing 
and land-use conversion to agriculture) and 
climate change (1-3). Worldwide, grasslands 
have undergone severe decreases in biodi- 
versity and ecosystem functions, leading to 
reductions in SOC storage (2, 4, 5). Here, we 
review the recent advances in our understanding 
of SOC dynamics, current challenges, and pos- 
sible solutions to enhance SOC sequestration in 
global grassland ecosystems. We address three 
questions: (i) How do key biotic and abiotic fac- 
tors regulate grassland SOC formation, turnover, 
and stability?; (ii) how do climate warming, 
alterations in precipitation, and fire affect SOC 
storage?; and (iii) how does grazing manage- 
ment affect SOC and how can improved prac- 
tices result in SOC sequestration? 


Mechanisms and drivers of SOC sequestration 


In grassland ecosystems, ~60% of net primary 
productivity is allocated belowground (6). 
Belowground carbon inputs are more often 
incorporated into SOC than aboveground 
inputs because of their chemical composition 
(e.g., aliphatic compounds and root exudates) 
and their presence in the soil (Fig. 1) (6). On
average, root carbon inputs have a SOC stabi- 
lization efficiency that is five times greater 
than aboveground carbon inputs (6). 
Organic carbon in soil is distributed between 
particulate organic matter (POM) and mineral- 
associated organic matter (MAOM) fractions, 




with only a minor portion (1 to 2%) present as 
dissolved organic matter. POM and MAOM 
differ in their formation, physical and chem- 
ical properties, and mean residence times in 
soil (7, 8). POM is formed from the fragmen- 
tation of plant and microbial residues, and 
therefore is composed of lightweight fragments 
made of large polymers (Fig. 1). MAOM, by 
contrast, is formed from single small molecules 
that are leached from plant residues or exuded 
from plant roots, which associate with minerals
directly (ex vivo) or after microbial assimilation
(in vivo) as microbial necromass (7, 8). MAOM
on average has a lower carbon:nitrogen ratio
than POM because of its proportionally higher
microbial origin. It also has a longer mean
residence time in soils (from decades to centuries)
compared with POM (<10 years to decades) and is
strongly chemically bonded to minerals and physically
protected in fine aggregates (7, 8). Therefore,
MAOM contributes to longer-term carbon 
sequestration in soil. Root exudates such as 
dissolved sugars, amino acids, and organic 
acids are the key pathway to MAOM formation 
largely through microbial in vivo transformations
(Fig. 1) (8, 9).

Fig. 1. Conceptual framework for key factors and mechanisms controlling SOC sequestration in
grassland ecosystems. (1) Plant diversity controls on productivity, biomass allocation, and SOC inputs
through litter and root exudates (6, 13, 14). (2) Key pathway of MAOM formation through microbial in vivo
transformation (8, 17). (3) Pathway of POM formation through microbial ex vivo modification (8, 17).
(4) Microbial necromass carbon (C) accumulation in MAOM (9, 11). (5) Climate change impacts on SOC
sequestration through plant and microbial pathways (26, 28). (6) Grazing and fire impacts on SOC storage
through pathways of plant and animal waste C inputs, compaction, and bioturbation (e.g., trampling and
wallowing), microbial in vivo transformation, and microbial ex vivo modification (33, 36, 38, 46). C:N,
carbon:nitrogen ratio; DOC, dissolved organic carbon.

Fig. 2. Patterns and climatic drivers of microbial necromass contribution to
SOC. (A) Microbial necromass C contribution to SOC. (B) Fungal and bacterial
necromass C concentrations. (C) Relationships of total microbial, fungal, and
bacterial necromass C contributions to SOC with mean annual precipitation in the
topsoil of grassland systems in Asia, North America, and Europe. Data are from
Liang et al. (17) and Wang et al. (18). Only the topsoil microbial necromass C and
corresponding SOC data (n = 223) were used for global and regional synthesis.
All data were classified into different grassland types within regions on the basis of
sampling site information from the original study: Asia (eight grassland types,
n = 122), North America (five grassland types, n = 47), and Europe (three grassland
types, n = 54). Within each grassland type, mean and standard error for each
variable were calculated across different sampling sites. General linear model
analyses were performed to explore whether the total microbial necromass
C contribution to SOC and fungal and bacterial necromass C concentrations differ
among different regions. Values with different letters are significantly different at the
P < 0.05 level. Simple linear regression was used to analyze the relationship
of mean annual precipitation with fungal, bacterial, and total microbial necromass
C contributions to SOC across all grassland types on the global scale.

Fig. 3. Impacts of grazing intensity and improved management
practices on SOC stocks. (A) Changes in SOC stock across different
levels of grazing intensity compared with ungrazed control [data are
from Eze et al. (5), Byrnes et al. (43), and Zhou et al. (44)]. (B) Impacts
of inorganic and organic fertilizers, liming, and different grazing
strategies on SOC stocks (mean ± 95% confidence interval) [data
are from Eze et al. (5), Byrnes et al. (43), and Gravuer et al. (50)].
(C) Impacts of improved management practices on SOC sequestration
rate (mean ± standard error) [management intervention data are from
Conant et al. (42) and plant diversity data are from Yang et al. (4)].
The number of studies used for calculating the average is given for
each grazing intensity or each type of management. The study duration

Plant aboveground, root, and rhizodeposition inputs exhibit different
POM and MAOM formation efficiencies. Approximately
46% of root exudates, 9% of root
tissues, and 7% of aboveground carbon residues 
are transformed into MAOM, whereas 19% of 
root litter is transformed into POM across crops, 
grasses, and trees growing in the field and under 
controlled laboratory conditions (10). Thus,
plants with greater carbon allocation to roots 
contribute more to soil carbon sequestration, 
particularly the formation of MAOM. However, 
it remains largely unclear how the contribu- 
tions of roots (root exudates and root litter) and 
aboveground inputs to SOC accumulation (POM 
and MAOM) change with grassland types, soil 
properties, and climate conditions. 
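
To make the formation efficiencies quoted above concrete, the following minimal Python sketch applies them to a set of plant carbon inputs. The efficiencies (46%, 9%, and 7%) come from the text; the input amounts and the names `MAOM_EFFICIENCY` and `maom_formed` are hypothetical and chosen only for illustration.

```python
# Fraction of each plant input transformed into MAOM, as quoted in the text.
MAOM_EFFICIENCY = {
    "root_exudates": 0.46,
    "root_tissues": 0.09,
    "aboveground_residues": 0.07,
}


def maom_formed(inputs):
    """Return the MAOM carbon formed from each input (same units as the inputs)."""
    return {name: amount * MAOM_EFFICIENCY[name] for name, amount in inputs.items()}


# Hypothetical carbon inputs in g C m^-2 (illustrative values, not measured data).
example_inputs = {
    "root_exudates": 100.0,
    "root_tissues": 100.0,
    "aboveground_residues": 100.0,
}

formed = maom_formed(example_inputs)
total_maom = sum(formed.values())

for name, amount in formed.items():
    print(f"{name}: {amount:.1f} g C m^-2 transformed into MAOM")
print(f"total: {total_maom:.1f} g C m^-2")
```

With equal inputs, root exudates account for most of the MAOM formed in this sketch, which is consistent with the statement that plants allocating more carbon to roots contribute more to MAOM formation.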

In grassland topsoils, 50 to 75% of SOC is 
found in MAOM. The average carbon:nitrogen ratio
varies from ~10 to ~12 for MAOM and from
~16 to ~18 for POM (3); therefore, the accrual 
of SOC in MAOM requires substantially greater 
nitrogen than the equivalent accrual in POM 
(11). The formation of POM is primarily driven 
by climate (temperature and precipitation). By 
contrast, the accumulation of MAOM is con- 
trolled by soil properties such as silt and clay 
content, cation-exchange capacity, and micro- 
bial nitrogen availability, which means that it 
may saturate (8, 12). In European grasslands, 
topsoil carbon storage in MAOM saturates at 
~50 g C kg⁻¹ soil, beyond which the additional
increase in SOC storage completely depends
upon accrual in POM (11). Currently, most
European grasslands (80%) are below saturation,
indicating a large capacity for SOC
sequestration in their topsoils (11).
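
The link between the carbon:nitrogen ratios quoted above and nitrogen demand can be illustrated with a short calculation. The sketch below assumes the ratios are mass ratios and that the nitrogen needed for a given carbon accrual is simply carbon divided by the ratio; the function name `nitrogen_needed` and the 1 Mg C accrual are hypothetical.

```python
def nitrogen_needed(carbon_accrual, cn_ratio):
    """Nitrogen required to accrue `carbon_accrual` at a given C:N mass ratio."""
    if cn_ratio <= 0:
        raise ValueError("C:N ratio must be positive")
    return carbon_accrual / cn_ratio


# C:N ranges quoted in the text for each fraction.
CN_RANGES = {"MAOM": (10.0, 12.0), "POM": (16.0, 18.0)}

carbon = 1.0  # hypothetical accrual of 1 Mg C
for fraction, (low, high) in CN_RANGES.items():
    n_max = nitrogen_needed(carbon, low)   # lowest C:N gives the highest N demand
    n_min = nitrogen_needed(carbon, high)
    print(f"{fraction}: {n_min:.3f} to {n_max:.3f} Mg N per Mg C accrued")
```

Under these assumptions, the same carbon accrual requires more nitrogen in MAOM than in POM across the whole quoted range.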

Plant diversity is a key driver of SOC for- 
mation and storage (4). High plant diversity 
enhances SOC storage by elevating below- 
ground carbon (i.e., root biomass and root 
exudates) inputs (13, 14) and promoting microbial
growth, turnover, and entombment of
necromass (15). Maintaining consistently high 
levels of biodiversity and root carbon inputs 
is essential for enhancing SOC storage and 
persistence in grasslands (Fig. 1). 

Fungi and bacteria have a strong influence 
on SOC accumulation, stabilization, and turn- 
over in grasslands (Fig. 1), as in other terres- 
trial ecosystems (6, 16). Microbial necromass 
plays an important role in SOC accumulation 
and stabilization (9, 17). In the topsoil of global 
grasslands, the contribution of the microbial 
necromass to total SOC ranges from 23 to 74%, 
with an average of 50% (Fig. 2A), which is 
greater than its contribution in agricultural 
and temperate forest soils (17, 18). The con- 
tribution of necromass to SOC changes with 
soil depth (18) and is typically dominated by
fungal necromass, with the fungi-to-bacteria 
necromass carbon ratio ranging from 1.2 to 
4.1 across global grasslands (Fig. 2B). This is 
likely because fungi produce more chemically 
recalcitrant structural compounds and have
greater carbon use efficiency than bacteria 
(6, 16). Moreover, mycorrhizal fungi, which live 
in association with plant roots and derive their 
carbon directly from the plant, can regulate the 
carbon sequestration capacity in soil. Carbon 
sequestration capacity per unit nitrogen in soil 
is 1.7 times greater in ecosystems dominated 
by ectomycorrhizal fungi-associated plants (e.g., 
savannas, shrublands, and forests) than in sys- 
tems dominated by arbuscular mycorrhizal 
fungi-associated plants (e.g., nonwoody grass- 
lands) because ectomycorrhizal fungi can pro- 
duce enzymes to degrade organic nitrogen 
from plant litter (19). However, MAOM is rela- 
tively higher in ecosystems that are dominated 
by arbuscular mycorrhizal fungi (13), such as
grasslands. 

Climate regulates the metabolic activity of 
microbes and thus controls large-scale patterns 
of microbial necromass and SOC storage (8, 20). 
At the global scale, cold, moist soils promote 
the accumulation of microbial necromass car- 
bon. The maximum microbial necromass car- 
bon occurs at a mean annual precipitation of 
900 to 1000 mm with a mean annual temper- 
ature <0°C (Fig. 2C), indicating high priorities 
for preserving the current stocks in these sys- 
tems. Few studies have measured the contri- 
bution of microbial necromass carbon to SOC 
in grassland soils, and data are lacking from 
Africa, South America, and Australia (17, 18, 20).
Microbial diversity may also affect SOC storage 
by regulating the efficiency of microbial assimilation
of carbon and the production of organo-mineral
associations in soils (21). Recently,
microbial diversity was found to promote the 
stabilization efficiency of grass litter-derived 
POM but to reduce that of MAOM (22). 


Climate change impacts on SOC sequestration 


Sixty-seven percent of the world’s grasslands 
are distributed in semiarid, arid, and cold climates,
with only 23% occurring in humid climates
(1). Thus, carbon sequestration in most
grasslands is highly sensitive to climate change, 
which can exert strong and diverse impacts on 
SOC accrual and stability through plant- and 
microbial-mediated mechanisms (8). The im- 
pacts of climate change on soil carbon seques- 
tration often vary with grassland type, climate, 
and soil conditions. In semiarid steppe, warm- 
ing may enhance root-derived carbon input 
but inhibit the decomposition of MAOM by 
suppressing fungal growth and soil respira- 
tion, resulting in an increase in the MAOM 
pool (23). In humid tallgrass prairies, warming 
may increase C4 grass cover and C4-derived
carbon input into soil organic matter, but it 
also increases the decay rate of these fractions, 
resulting in a negligible change in soil car- 
bon sequestration (24). In alpine grasslands, 
warming-induced permafrost degradation 
reduces active-layer SOC storage by decreasing 
the stability of microbial networks and accel- 
erating SOC (and specifically POM) decay (25). 
A recent meta-analysis demonstrated that long-term
(≥5 years) warming increases the ratios
of ligninase to cellulase activity and enhances 
microbial utilization of recalcitrant carbon, 
leading to a 14% reduction in the topsoil re- 
calcitrant carbon pool (26). However, warming 
may increase the accumulation of root-derived 
carbon in the subsoil MAOM pool (27). POM is 
much more climate sensitive than MAOM (3, 11). 
The percent change in POM (-12.2%) with cli- 
mate warming is on average three times 
greater than that in MAOM (-3.8%) in global 
grasslands (28). This suggests that grasslands 
with a high proportion of MAOM will contrib- 
ute less to soil carbon-climate feedbacks. 
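
A back-of-the-envelope sketch of this reasoning is given below. It assumes, purely for illustration, that total SOC consists only of POM and MAOM and that the global average changes quoted above (-12.2% for POM and -3.8% for MAOM) apply uniformly to each fraction; the function name `soc_change_with_warming` is hypothetical.

```python
POM_CHANGE = -12.2   # percent change in POM with warming (from the text)
MAOM_CHANGE = -3.8   # percent change in MAOM with warming (from the text)


def soc_change_with_warming(maom_fraction):
    """Weighted percent change in SOC for a soil with the given MAOM share.

    Assumes SOC = POM + MAOM and that each fraction changes by its
    global average percentage (a simplification for illustration).
    """
    if not 0.0 <= maom_fraction <= 1.0:
        raise ValueError("maom_fraction must be between 0 and 1")
    pom_fraction = 1.0 - maom_fraction
    return maom_fraction * MAOM_CHANGE + pom_fraction * POM_CHANGE


# The text reports that 50 to 75% of topsoil SOC is found in MAOM.
for share in (0.50, 0.75):
    print(f"MAOM share {share:.0%}: SOC change {soc_change_with_warming(share):.1f}%")
```

Under these assumptions, raising the MAOM share from 50 to 75% reduces the estimated SOC loss from about 8.0% to about 5.9%.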
Future projected precipitation anomalies 
and long-lasting droughts (29, 30) will likely 
influence soil carbon sequestration of grass- 
land ecosystems by altering plant community 
composition, productivity and carbon alloca- 
tion, and microbial processes. In the semiarid 
steppe, increased precipitation promotes soil 
aggregation by stimulating fungal growth 
and increasing soil-exchangeable magnesium 
(23). Precipitation anomalies (increases and 
decreases) can substantially alter root-to-shoot 
ratios and vertical root distribution in grasslands
(31), thus regulating soil microbial growth
and SOC storage. Reduced precipitation strongly
suppresses oxidase activity, whereas higher
precipitation stimulates the activity of nitrogen- 
acquisition extracellular enzymes (32). However, 
on the global scale, only tendencies were
observed in grasslands with increased precipitation
(negative for POM and positive for MAOM and
total SOC concentrations) because of
the limited data availability (28).

Climate change-induced increases in fire 
frequency can substantially modify long-term 
SOC storage in grasslands, particularly in 
savanna grasslands, by intensifying nutrient 
limitation, which suppresses plant growth and 
carbon inputs. Elevated fire frequencies reduce 
soil carbon stocks on average by 0.21 megagrams 
of carbon per hectare per year (Mg C ha⁻¹ year⁻¹)
in the upper soil layer (0 to 20 cm) in global 
savanna grasslands (33). However, a recent study 
showed that fire suppression (i.e., >60 years of 
fire exclusion) has little effect on total SOC stor- 
age (0 to 60 cm) in tropical savannas because 
C4 grass-derived carbon dominates the SOC,
particularly in deeper soil layers, where soil 
carbon is less affected by changes in fire fre- 
quencies (34). It remains unclear to what extent 
different fire regimes regulate plant diversity, 
above- and belowground biomass allocation, 
microbial-mediated processes, and SOC stor- 
age in shallower and deeper soil profiles. 


Impacts of grazing pressure on grassland 
soil carbon 


Natural grasslands are grazed by wild ungu- 
lates, which can enhance SOC storage because 
they graze for short periods of time and move 
across the landscape. This results in main- 
tained plant cover, diversity and productivity, 
promotion of species with deep roots, micro- 
bial processing with the formation of both 
POM and MAOM, and soil-mixing processing 
by soil fauna (35, 36). Increases in ecosystem 
metabolism and plant labile carbon inputs 
(e.g., root exudates) are expected to increase 
both the ex vivo and in vivo formation of 
MAOM (9, 10, 37). Conversely, increased root 
inputs and allocation to depth result in higher 
POM in the subsoil (6, 38). In addition, large 
herbivores create habitats for many bioturba- 
tors (e.g., fossorial mammals and soil macro- 
fauna) to loosen up soil and expose larger 
aggregates of soil organic matter to organo- 
mineral interaction by vertical soil mixing (36). 
However, both the direction and magnitude of 
effects of large wild herbivores on soil carbon 
storage can vary strongly with soil nutrient 
availability, across grasslands, and under dif- 
ferent levels of herbivore density. For exam- 
ple, a recent short-term study suggested that 
nutrient availability strongly moderates the 
impact of herbivore grazing on soil carbon 
sequestration in herbaceous grasslands (39). 
Large herbivore grazing increases the upper- 
layer soil carbon storage under elevated nu- 
trient (fertilization) conditions but has no effect 


on soil carbon storage under ambient nutrient 
conditions (39). Sandhage-Hofmann et al. (40) 
report that elevated elephant densities enhance 
SOC stocks [4.7 tons (t) ha⁻¹] despite losses
of woody biomass in moist, semiarid, wood- 
encroached savannas of south-central Africa. 
However, a synthesis of 174 experiments showed 
that large herbivore exclusion generally increases 
SOC storage across different biomes (grassland, 
forest, shrubland, tundra, woodland, etc.), sug- 
gesting an overall negative impact of large wild 
herbivores on soil carbon storage (42). 
Livestock grazing is the most common use 
of grasslands worldwide. Some grasslands 
are managed to improve forage quantity and 
quality, thereby increasing livestock production 
and/or SOC storage (1, 2, 42). In livestock- 
dominated systems, these pathways are strongly 
controlled by grazing intensity and rest periods. 
Continuous livestock grazing reduces plant 
cover, diversity, and productivity, and thus 
root inputs and plant- and microbial-mediated 
SOC formation, while stimulating losses 
through microbial turnover and erosion caused 
by increased compaction and reduced cover 
(1, 2, 43). Eze et al. (5) demonstrated that
livestock grazing on average decreases SOC 
stock by 15% across five continents, with the 
greatest reduction (-22.4%) in SOC stock in 
the tropics and the least reduction (-4.5%) in 
temperate grasslands. At the global scale, light 
grazing (e.g., seasonal and rotational grazing) 
shows the least negative effects or even pro- 
motes soil carbon storage, whereas moderate 
and heavy (continuous) grazing consistently 
reduces soil carbon stocks (Fig. 3A) (5, 43, 44). 
For a given category of grazing intensity, the 
discrepancy in magnitude of changes in SOC 
stocks between these studies may partly arise 
from the lack of quantitative measures of 
grazing intensity and the difference in data 
sources (5, 43, 44). Nevertheless, the magni- 
tude and directions of grazing impacts on soil 
carbon sequestration are context dependent 
and vary with climate and soil conditions, 
vegetation properties, livestock type, herbivore 
diversity, grazing strategies (e.g., continuous 
versus rotational grazing), and grazing inten- 
sity and duration (5, 38, 43-45). The negative 
impact of increasing grazing intensity on SOC 
is lessened with greater water availability 
(5, 44) but is more severe with warmer tem- 
peratures and longer grazing duration in tem- 
perate grasslands (44). With moderate and 
heavy grazing, SOC increases in grasslands 
dominated by C₄ species and decreases in
grasslands dominated by C₃ species (45). Sheep
grazing generally has a greater negative im- 
pact on SOC than cattle grazing, and the re- 
duction in SOC with grazing is substantially 
greater in topsoil than that in subsoil (44). A 
mixed cattle and megaherbivore system was 
shown to be a sustainable management strat-
egy in African savanna ecosystems with high
herbivore diversity (46). Moreover, rotational
grazing consistently shows higher SOC stocks
compared with continuous grazing (or free
grazing) (43), with gains observed specifically
in the mineral-associated fraction (47).

Fig. 4. Soil C sequestration potential of grassland ecosystems at the regional
and global scales. SOC sequestration potentials are arranged according to
average SOC sequestration rates and area of each management strategy for
denoting their relative contributions. (A) Capacity and attainability of SOC
sequestration by restoring degraded grasslands [data are from White et al. (1),
De Deyn et al. (52), Deng et al. (54), and Fargione et al. (53)]. At the global
scale, SOC sequestration potential is presented as theoretical, realistic, and
achievable, respectively, based on Chapman (55). Means and the 95% confidence


Managing for soil carbon storage in grasslands 


Empirical and experimental studies have indi- 
cated that improving grassland management can 
increase SOC storage, thus mitigating carbon 
losses caused by climate change, long-term over- 
grazing, and grassland degradation (2, 42, 48). 
Management improvements may result in soil 
carbon accrual through several interrelated 
mechanisms (Fig. 1). Conversion from crop- 
lands to grasslands removes disturbance from 
tillage and increases root carbon inputs to soil 
(6, 42). Restoring the biodiversity of degraded 
grasslands may increase plant production and
promote microbial turnover and necromass
entombment (4, 13, 15). Grazing improvement 
can increase higher-quality root carbon (lower 
carbon:nitrogen ratios) inputs (38) and/or 
nitrogen retention, thus promoting the forma- 
tion and persistence of MAOM in soils (47). 
Sowing legumes increases soil carbon and 
nitrogen inputs by elevating root biomass, 
root exudates, and fine root turnover (42, 49). 
Applications of inorganic and organic fertilizers 
may stimulate primary productivity and high- 
quality plant carbon inputs to soil, resulting in 
more efficient microbial carbon use (5, 28, 50). 

A number of management interventions
have been adopted to restore grasslands
(Fig. 3, B and C). On the global scale, im-
proved grassland management increases SOC
stocks on average by 0.47 Mg C ha⁻¹ year⁻¹
(42). This suggests that the world’s grazing
lands, which occupy an area of ~34 million km²,
have substantial potential to increase SOC stor- 
age (Fig. 4). Among all improved manage- 
ment practices, conversion from cultivation to 
grasslands, increasing plant diversity, sowing 
legumes and grasses, and fertilization are asso- 
ciated with the highest soil carbon sequestra- 
tion rates (Fig. 3C) (4, 42). Under moderate 
grazing intensity, the average SOC stock in- 
crease (28.4%) is substantially greater with 
rotational grazing than that with continuous 
grazing (Fig. 3B). In the southeast United 
States, grassland soils managed with adaptive 
multi-paddock grazing that used high-density,
short-duration rotational grazing had more car-
bon (72.49 Mg C ha⁻¹) and nitrogen (9.26 Mg
N ha⁻¹) stocks compared with continuous graz-
ing (64.02 Mg C ha⁻¹ and 8.52 Mg N ha⁻¹) in
the 0 to 100 cm soil layer (47). However, the
direction and magnitude of management ef-
fects on soil carbon stocks are context specific,
depending on factors such as climate, plant 
community composition, and soil properties 
(5, 43, 50). Therefore, grazing practices need 
to be implemented with an understanding of 
context. Moreover, further studies are required 
to examine the synergy and trade-offs among 
grassland biodiversity, primary productivity, 
and soil carbon sequestration under man- 
agement interventions. 
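
As a rough worked example of the adaptive multi-paddock comparison cited above (47), the short Python sketch below computes the absolute and relative differences in carbon and nitrogen stocks from the values quoted in the text. The function name `stock_difference` is illustrative and does not come from the cited study; the calculation is plain arithmetic and says nothing about statistical significance or site-to-site variation.

```python
# Stocks quoted in the text for the 0 to 100 cm soil layer (47), in Mg ha^-1.
amp = {"carbon": 72.49, "nitrogen": 9.26}         # adaptive multi-paddock grazing
continuous = {"carbon": 64.02, "nitrogen": 8.52}  # continuous grazing


def stock_difference(managed, reference):
    """Return (absolute, percent) difference of managed relative to reference."""
    absolute = managed - reference
    percent = 100.0 * absolute / reference
    return absolute, percent


for element in ("carbon", "nitrogen"):
    absolute, percent = stock_difference(amp[element], continuous[element])
    print(f"{element}: +{absolute:.2f} Mg ha^-1 ({percent:.1f}% higher)")
```

With these figures, the carbon stock is about 8.47 Mg C ha⁻¹ (roughly 13%) higher and the nitrogen stock about 0.74 Mg N ha⁻¹ (roughly 9%) higher under adaptive multi-paddock grazing.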

Soil carbon sequestration potential varies in 
both quantity and attainability among grass- 
lands with different degrees of degradation 
and across different regions (Fig. 4). Given that 
~50% of the global grassland area has been 
degraded (1, 2), restoration of grassland cover
and biodiversity is an effective strategy for 
promoting SOC storage and mitigating the 
negative impacts of global climate change 
(4, 15, 51-53). For example, the SOC accrual 
rate with grazing exclusion is on average 
0.68 Mg C ha⁻¹ year⁻¹ in topsoil (0 to 30 cm)
and 0.62 Mg C ha⁻¹ year⁻¹ in subsoil (30 to
100 cm) across 145 degraded grassland sites in 
China (54), indicating that it has not reached 
saturation over the 27-year period of grassland 
restoration. 
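
To give a sense of scale, the sketch below multiplies the mean accrual rates reported for grazing exclusion (54) by the 27-year restoration period. It assumes, purely for illustration, that a site accrues carbon at the mean rate in every year of the period; real sites differ in age and trajectory, so the output is an indicative upper figure rather than a measured stock change. The helper name `cumulative_accrual` is hypothetical.

```python
YEARS = 27  # restoration period mentioned in the text (54)

# Mean accrual rates under grazing exclusion, in Mg C ha^-1 year^-1 (54).
rates = {"topsoil (0 to 30 cm)": 0.68, "subsoil (30 to 100 cm)": 0.62}


def cumulative_accrual(rate, years):
    """Total SOC gain if the mean rate held in every year (an assumption)."""
    return rate * years


totals = {layer: cumulative_accrual(rate, YEARS) for layer, rate in rates.items()}
for layer, total in totals.items():
    print(f"{layer}: {total:.2f} Mg C ha^-1 over {YEARS} years")
print(f"0 to 100 cm profile: {sum(totals.values()):.2f} Mg C ha^-1")
```

Under that constant-rate assumption, the gain would be about 18.4 Mg C ha⁻¹ in topsoil and 16.7 Mg C ha⁻¹ in subsoil.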

Potential soil carbon sequestration capacities 
can be categorized as theoretical, realistic, or 
achievable (55). Theoretical soil carbon seques- 
tration capacity refers to the estimate of restor- 
ing all soils to their natural capacity or even 
enhancing it through management interven- 
tions, realistic soil carbon sequestration capac- 
ity refers to the optimistic value accounting for 
social and economic constraints, and achieva- 
ble capacity is the value of a pragmatic sce- 
nario based on the current trends (55). At the 
global scale, the mean theoretical, realistic, 
and achievable capacities of SOC sequestration 
with grassland restoration are estimated to be 
10.2, 6.8, and 3.4 billion t CO₂ equivalents per
year (CO₂e year⁻¹), respectively (Fig. 4A). At
the regional scale, Africa, Asia, and Europe 
are projected to have the largest achievable 
capacity of soil carbon sequestration with 
grassland restoration, with Oceania and North 
and South America exhibiting the least SOC 
sequestration potential (Fig. 4A). These global 
patterns of SOC sequestration potential are 
primarily caused by the differences in average 
soil carbon sequestration rate and the area of 
degraded grassland in different regions. The 
greater SOC sequestration potential with grass- 
land restoration in Africa and Asia is due to the 
larger areas of degraded grasslands in these 
continents, whereas European grasslands have 
a higher average soil carbon sequestration rate 
(Fig. 4A). In addition, optimizing grazing in- 
tensity (e.g., rotational grazing) is projected to 
increase soil carbon sequestration potential 
by 148 to 699 megatons (Mt) CO₂e year⁻¹ in
global grazing lands (Fig. 4B), with the greatest
SOC sequestration potential occurring in Cen-
tral and South America, Africa, and Asia (57).
Moreover, sowing legumes is projected to 
enhance SOC storage by 147 Mt CO₂e year⁻¹
in global pasturelands (57), with Europe ex- 
hibiting the greatest soil carbon sequestration 
potential caused by both the largest pastureland 
areas and the highest average soil carbon se- 
questration rate (Fig. 4C). At both the regional 
and global scales, large uncertainties exist 
regarding the projected soil carbon seques- 
tration potential and rate of accrual. These 
uncertainties are caused by the complex inter- 
actions among climate change, human activ- 
ities, and spatial and temporal variations in 
ecosystem and soil responses (51, 53, 56). 
Scientific research and management innova- 
tions are required in the future to maximize the 
attainable SOC storage in global grasslands. 
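
Because rates elsewhere in this section are given in mass of carbon while the global capacities are given in CO₂ equivalents, a unit conversion helps when comparing them. The sketch below converts the three restoration capacities from Fig. 4A into carbon mass and expresses each as a share of the theoretical value. It assumes the CO₂e figures represent CO₂ only (reasonable for SOC sequestration, but not stated explicitly in the text) and uses standard molar masses that are not taken from the source.

```python
# Approximate molar masses in g mol^-1 (standard values, not from the source).
M_C = 12.011
M_CO2 = 44.009

# Global capacities with grassland restoration, in billion t CO2e year^-1 (Fig. 4A).
capacities = {"theoretical": 10.2, "realistic": 6.8, "achievable": 3.4}


def co2e_to_carbon(co2e):
    """Convert a mass of CO2 to the mass of carbon it contains."""
    return co2e * M_C / M_CO2


for name, value in capacities.items():
    share = value / capacities["theoretical"]
    print(f"{name}: {value} Gt CO2e = {co2e_to_carbon(value):.2f} Gt C "
          f"({share:.0%} of theoretical)")
```

On these numbers, the realistic and achievable capacities are about two-thirds and one-third of the theoretical capacity, respectively.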


Conclusion 


Recent studies have made considerable prog- 
ress toward addressing major challenges as- 
sociated with identifying the capacity and key 
mechanisms of various grasslands to sequester 
and preserve carbon in soils and developing 
knowledge-based strategies to restore bio- 
diversity, preserve current SOC stocks, and 
promote additional sequestration for climate 
change mitigation and sustainable manage- 
ment in grasslands. These advances highlight 
the important roles of plant and soil bio- 
diversity in regulating the formation of micro- 
bial necromass carbon, MAOM, and POM, 
mediating the impacts of climate change, 
and promoting SOC storage through manage- 
ment improvements and restoration in global 
grasslands. They also demonstrate that the 
impacts of climate change, grazing, fire, grass- 
land restoration, and mitigation solutions on 
soil carbon sequestration are moderated by 
multiple context-dependent factors. Future 
research is needed to address the uncertainty 
and context dependency of the proposed miti- 
gation solutions and their carbon sequestration 
potentials and to consider their possible syn- 
ergies and trade-offs for biodiversity conserva- 
tion, climate mitigation, and food production.

The planetary role of seagrass conservation


Richard K. F. Unsworth, Leanne C. Cullen-Unsworth, Benjamin L. H. Jones, Richard J. Lilley


Seagrasses are remarkable plants that have adapted to live in a marine environment. They form extensive 
meadows found globally that bioengineer their local environments and preserve the coastal seascape. With the 
increasing realization of the planetary emergency that we face, there is growing interest in using seagrasses 
as a nature-based solution for greenhouse gas mitigation. However, seagrass sensitivity to stressors is acute, 
and in many places, the risk of loss and degradation persists. If the ecological state of seagrasses remains 
compromised, then their ability to contribute to nature-based solutions for the climate emergency and 
biodiversity crisis remains in doubt. We examine the major ecological role that seagrasses play and how 
rethinking their conservation is critical to understanding their part in fighting our planetary emergency. 


Though commonly called grasses, sea-
grasses are a unique group of submarine
flowering plants that belong to the mono- 
cotyledon order Alismatales, comprising 

four families and 72 species. Although they 
occupy a broad range of niches and are derived 
from multiple evolutionary lineages (1), they all
share a connection to marine environments and 
consistently exhibit features that separate them 
from all other angiosperms. Seagrasses have 
adapted to live underwater, where light is lim- 
ited, where salt and nutrients can be problem- 
atic, and where soils can become highly toxic (2). 
Seagrass diverged from other alismatid 
monocots ~105 million years ago, and work 
by Olsen et al. (3) supports hypotheses that 
modern seagrass biodiversity can be linked to 
the materialization of multiple habitats after the 
Cretaceous-Paleogene extinction event. In the 
past decade, the seagrass science community 
has grown (4) and revealed the uniqueness of 
these plants and the importance of the ecosys- 
tems that they create (Fig. 1). Seagrasses bioen- 
gineer their environment by slowing water flow, 
trapping particles, and improving the environ- 
ment within a positive feedback mechanism to 
facilitate the creation of habitat (5). Just like 
terrestrial plants, their reproduction can be sup- 
ported by a diverse range of pollinators, such as 
cumacean crustaceans (6), and seed dispersers, 
such as fish (7). Their reproduction is not always 
sexual—genetic evidence has revealed that veg- 
etative growth has led to the establishment of 
one single clonal organism spanning >180 km of 
coastline (8). Nitrogen-fixing bacteria living with- 
in their roots allow them to colonize nitrogen- 
poor environments (9), and associations with 
clams (and their bacterial symbionts) have aided 
their ability to inhabit otherwise toxic sulphide-
rich marine soils (10). There is also growing evi-
dence of the presence of fungi associated with
the roots and rhizomes of seagrasses, indicating
that they may play essential roles similar to those
of fungal associates of terrestrial plants (11).
Aside from their ecological uniqueness, sea- 
grasses are of increasing interest in a socio- 
political context owing to their potential to help 
combat the current climate and biodiversity
crises that our planet faces. Seagrass meadows 
also support human well-being by virtue of their 
role in supporting fisheries, coastal protection, 
and water filtration (12), and action for their
conservation supports the fulfilment of the 17 
Sustainable Development Goals (SDGs) pro- 
posed by the United Nations in 2015. Seagrasses
also support many species of conservation con-
cern, such as the dugong, green turtle, and man- 
atee (13), and provide interacting ecological 
functioning throughout the coastal seascape (14). 

To harness the power of seagrass as a nature- 
based solution to the climate emergency and the 
biodiversity crisis, seagrass systems must be in a 
resilient functioning state. Seagrass meadows 
remain globally threatened by diverse factors, 
including poor water quality, damage from boats 
and related activities, aquaculture, and coastal 
development (15). Even in areas where seagrass
is protected, extreme climate drivers place sea- 
grass at risk. For example, after a marine heat- 
wave in 2010 to 2011, up to 699 km² of seagrass
meadow in the Shark Bay Marine Park in West- 
ern Australia were lost or damaged, potentially 
releasing up to 9 Tg of CO₂ back into the atmo-
sphere during the 3 years before regrowth oc- 
curred (16). Seagrass sensitivity to stressors is 
acute and may even extend to the effects of an- 
thropogenic noise (17). In many places, the risk 
of seagrass loss and degradation persists (15), 
and its functional state is commonly compro- 
mised; unless this can be reversed, the potential 


for seagrass to contribute to the complex jigsaw 
of nature-based solutions remains in doubt. In 
this Review, we reflect on the status of seagrass 
ecosystems, the major ecological role that they 
play in the coastal environment, and how re- 
thinking their conservation is critical to allowing 
them to play a role in reversing climate change. 


Global decline, net-zero loss, and achieving 
net gain 

The role that seagrass can have in reversing or 
mitigating climate change requires considera- 
tion of their global biogeochemical contribution. 
For this, we first need a better understanding of 
whether seagrasses are currently in a state of net 
loss, stasis, or net gain globally, along with the 
parameters that drive their greenhouse gas bal- 
ance (Fig. 1). The global coverage of seagrass is 
currently estimated to be 160,387 to 266,562 km²
(18). This range reveals that we have very limited 
understanding of the actual extent of seagrass 
populations. We also do not fully understand the 
extent of the ecological goods and services that 
seagrass provides, including to biodiversity and 
coastal protection. Studies have sought to place 
estimates on seagrass loss at 1 to 7% per year 
(19, 20) and create global carbon storage esti- 
mates of up to 19.9 Pg (21, 22). However, if we do 
not know how much we have or have had, we 
cannot hypothesize very well on what has been 
lost or its associated ecological relevance. 
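
The width of these ranges matters in practice. The sketch below compounds the two ends of the reported annual loss range (19, 20) over the two ends of the reported area range (18). It assumes a constant fractional loss every year, which is a simplification introduced here for illustration and not a claim made by the cited studies; the function names are hypothetical.

```python
import math

# Ranges quoted in the text: global extent (18) and annual loss rates (19, 20).
AREA_RANGE_KM2 = (160_387, 266_562)
LOSS_RATES = (0.01, 0.07)


def remaining_area(area_km2, annual_loss, years):
    """Area left after compounding a constant fractional loss each year."""
    return area_km2 * (1.0 - annual_loss) ** years


def halving_time(annual_loss):
    """Years until half the area is lost at a constant compound loss rate."""
    return math.log(0.5) / math.log(1.0 - annual_loss)


for rate in LOSS_RATES:
    low, high = (remaining_area(area, rate, 10) for area in AREA_RANGE_KM2)
    print(f"{rate:.0%} per year: {low:,.0f} to {high:,.0f} km2 after 10 years, "
          f"halving time {halving_time(rate):.1f} years")
```

Under this simplification, the time for the global area to halve is about 69 years at 1% per year but under 10 years at 7% per year, which shows how strongly any estimate of loss depends on poorly constrained inputs.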

The reported trajectory of seagrass coverage 
(20, 23) indicates that it may be recovering in 
some areas; however, this analysis is limited 
because it only focuses on locations where sea- 
grass is mapped, monitored, and likely affected 
by some level of conservation action, and it may 
represent only a fraction of potential and un- 
known seagrass area. Analyses are also limited 
by favoring data published in academic journals 
and excluding available data in the gray litera- 
ture. A coordinated global effort is required to 
create meaningful global estimates of seagrass 
coverage and change that are validated with open 
data sharing between governments, academics, 
nongovernmental organizations, and commer- 
cial enterprises (18). In the UK, a technology-
focused consortium is forming to fill the gaps in 
our knowledge to help drive understanding of 
the ecological role of seagrasses (24), and recom- 
mendations for a methodological pathway to 
improve the global seagrass map have recently 
been proposed (18, 25).


Seagrass as a nature-based solution 


The growing interest in nature-based solutions 
is necessitating deeper understanding of the 
ecological role that seagrass meadows play in 
the context of climate change. Seagrass mea- 
dows store and sequester carbon within their 
sediments over long periods of time at highly 
efficient rates; however, this role varies over space 
and time along with factors such as hydrody- 
namics and species composition influencing
this function. Additionally, despite their more
obvious role in the storage of organic carbon, 
seagrasses, like most vegetation, also produce 
the greenhouse gases methane (CH₄) and ni-
trous oxide (N₂O). The balance of these emis-
sions relative to the storage of carbon is of 
principal importance in the context of their role 
in influencing climate. Limited understanding 
exists with respect to whole-seagrass ecosystem 
greenhouse gas balance (Fig. 2). Available data 
indicate that seagrasses have broadly lower 
greenhouse gas emissions of CH₄ and N₂O
than comparative coastal and wetland habitats 
and that low salinity and anthropogenic stres- 
sors are major processes driving production (26). 
Similarly, comparison with habitats such as 
peatlands and mangroves shows seagrasses to 
be relatively low in CH₄ and N₂O (27). However,
after seagrass meadow degradation and loss, 
there exists a potential for high emissions of 
CH₄ from underlying sediment (28). Eutroph-
ication of seagrasses may also drive elevated
N₂O emissions. Although scientific understand-
ing in this field is increasing rapidly, our lack of
understanding of the drivers of greenhouse gas
emissions by plants, least of all by seagrasses 
(21, 27), contributes to the uncertainties that 
surround the marketing of blue carbon (29). 
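
One way to express the balance between carbon storage and CH₄ and N₂O emissions is to convert all three fluxes to CO₂ equivalents and sum them. The sketch below does this with 100-year global warming potentials from the IPCC Sixth Assessment Report (27 for non-fossil CH₄ and 273 for N₂O); the choice of metric is an assumption made here, and the flux values are hypothetical numbers chosen only to show the arithmetic, not measurements from any seagrass meadow.

```python
# 100-year global warming potentials (IPCC AR6 values; the choice of metric
# is an assumption, the source does not state which one should be used).
GWP100 = {"CO2": 1.0, "CH4": 27.0, "N2O": 273.0}


def net_co2e(co2_burial, ch4_emission, n2o_emission):
    """Net balance in g CO2e m^-2 year^-1; negative means a net sink.

    co2_burial is carbon storage expressed as CO2 removed; the two
    emission terms are gas masses released (all in g m^-2 year^-1).
    """
    return (-co2_burial * GWP100["CO2"]
            + ch4_emission * GWP100["CH4"]
            + n2o_emission * GWP100["N2O"])


# Hypothetical fluxes chosen only to illustrate the arithmetic.
healthy = net_co2e(co2_burial=100.0, ch4_emission=0.5, n2o_emission=0.05)
degraded = net_co2e(co2_burial=0.0, ch4_emission=2.0, n2o_emission=0.2)
print(f"healthy meadow:  {healthy:+.2f} g CO2e m^-2 year^-1")
print(f"degraded meadow: {degraded:+.2f} g CO2e m^-2 year^-1")
```

Because the warming potentials of CH₄ and N₂O are large, small changes in their emissions can offset a substantial share of the carbon buried, which is why whole-ecosystem measurements are needed before a meadow can be treated as a reliable net sink.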

Although its capacity for carbon storage is of 
high current interest, human appreciation for 
the ecological role of seagrasses has changed 
(30). An historic view of seagrasses from the 
Northern Hemisphere shows their importance 
in food production and as a raw material. For 
example, house roofs in Denmark were thatched 
with dried seagrass (some of which can still be 
seen), and seagrass detritus was used to fertilize 
crops (30). In the late 1800s, when Indian cotton 
crops failed, documented discussion by British 
cotton traders turned to the use of seagrass as an 
alternative fiber. In North America, companies 
existed that traded in seagrass as an insulation 
material, which was subsequently used in the 
US Capitol building. The Seri people of the Gulf 
of California collected seagrass seed to create a 
gruel (31). In the 21st century, in many parts of
the world, seagrass meadows are a source of 
food from the gastropod and bivalve mollusks 



Fig. 1. Seagrass and biodiversity. (A to C) Seagrass meadows contain biodiverse and enigmatic species assemblages, including the leafy sea dragon (A), sea stars 
(B), and predators such as crocodiles (C). (D) The biodiversity and productivity of seagrass meadows also lead to them storing and sequestering substantial 
amounts of carbon in their sediments. Seagrass meadows provide habitat in support of biodiversity [(A) to (C)] in coastal waters globally. When healthy and in a 
balanced state, seagrass can be a great source of many other ecosystem services, such as water filtration, carbon storage (D), and coastal defense. Anthropogenic 
factors, such as coastal development and poor water quality leading to eutrophication of coastal waters, are some of the principal drivers of seagrass decline. 


and sea cucumbers that they shelter (32). The 
importance of seagrass habitats as a source of 
seafood production is both direct and indirect 
at local and basin-wide scales, with 20% of the 
world’s biggest finfish fisheries having some 
known association with seagrass (33). 
Seagrasses also play a fundamental role in 
the filtration of coastal waters, trapping parti- 
cles (including microplastics), cycling nutrients, 
and absorbing nitrogen from the water column 
(34). This filtration role also extends to the re- 
moval of bacteria and viruses (35-37), thus con- 
tributing to improved sanitation (38) and 
human health and well-being (12). In the Baltic
Sea, seagrass meadows have been recorded to 
contain 63% fewer potentially harmful Vibrio 
vulnificus and Vibrio cholerae bacteria com- 
pared with nonvegetated areas (37). 
Additionally, the role of seagrass in protecting 
coastlines from erosion is substantial and may 
grow in value with sea level rise and as storms 
become more frequent (17). The locally relevant 
role of seagrasses in ameliorating low pH from 
ocean acidification may also increase the value
of these marine plants over time (39, 40). Al-
though the ecological roles that seagrasses play 
around the world shift with space and time, the 
constant across most of the world’s seagrasses is 
that they remain at ecological risk and many are 
in a perilous state. 


What is a pristine, healthy, or balanced 
seagrass ecosystem? 


The extent and function of seagrass meadows 
are largely manifestations of current and previ- 
ous human activity. We have limited capacity to 
appreciate the value of seagrass owing to the 
scale of alteration and unknown baselines for 
these systems (41, 42). Evidence from ecological 
feedbacks indicates that seagrass meadows are 
driven by top-down and bottom-up processes 
(43, 44). Although there is increasing appreci- 
ation for how seagrass might be influenced by 
excess nutrients and various pollutants in our 
coastal waters, we have limited appreciation 
for what extreme overexploitation of near-shore 
environments has done to seagrass meadows. 
We simply do not know what a so-called pristine 
meadow looks like, which creates a limited ap- 
preciation for the true ecological role of these 
poorly understood systems. A contributory fac- 
tor to the poor understanding is the low relative 
research output on seagrasses [see (45)]. How- 
ever, it is apparent that there has been a pro- 
found loss of predators from these systems, 
whereas numbers of consumers, secondary con- 
sumers, and grazers have also been affected 
(46)—in some cases, loss of predators has led 
to overgrazing (47, 48). 

In localities where associated biodiversity is 
high, functional redundancy may serve to pro- 
tect seagrass meadows (49), but with decreasing 
diversity away from the tropics, such redun- 
dancy may be reduced. There is also a growing 
appreciation for seagrass as a foraging resource 
for seabirds; this is because they support abun- 
dant prey items, such as crustaceans, polychaetes, 
and fish (50). Given the parallel global decline 
of avifauna with global seagrass, we can only 
speculate as to what the functional role of loss 
of seagrass might have once been (51).

In recent decades, biodiversity and ecosys- 
tem functioning has evolved into a dynamic 
area of contemporary ecology with a rich body 
of research. Compared with research in terres- 
trial grasses and even seaweeds, the body of re- 
search within seagrass is magnitudes smaller 
and is fueled by a smaller community of scien- 
tists. We must understand the biodiversity asso- 
ciated with seagrass meadows to be able to 
develop management programs that secure their 
ecological functioning under further climate 
change. Global and regional studies are begin- 
ning to transform our knowledge (44, 52, 53), 
but tools such as sequencing environmental DNA 
need to be more widely applied. Reconstruc- 
tions using molecular and historical evidence 
are needed to understand the true ecological
potential of these ecosystems, to locate sites for
rehabilitation and replanting, and to provide 
ambition to marine conservation. 


Seagrass meadows and the SDGs 


Improved protection and restoration of sea- 
grasses require better recognition of the role 
that they play in supporting people and our 
planet; the state of seagrasses is symptomatic 
of the deteriorating state of the overall natural 
environment (54). The United Nations SDGs 
are a means of framing a response to this emer- 
gency by connecting the daily actions and needs 
of people, institutions, and communities to the 
sustainability of the planet and transforming 
these connections into measurable actions for 
positive environmental, social, and ecological out- 
comes. Articulating the ecological role of seagrass 
in terms of ecosystem services and natural capital 
promotes a scientific vision of what behavioral 
change might mean for seagrass, whereas the 
SDGs provide a framework for how change can 
be perceived by all people. We suggest that, of 
the 17 SDGs, action for seagrass conservation 
and restoration can make a meaningful contri- 
bution to 16 of these global goals (Fig. 3). We 
propose that the ecological role and value of 
seagrass can also be described in these terms to 
improve and catalyze action to halt and reverse 
seagrass loss. 

Seagrass meadows form globally relevant 
habitats that support fisheries and associated 
economic goods; it is in this ecological role that 


[Fig. 2, left panel: Healthy productive seagrass; labels: N₂O, CH₄, CO₂ flux]


seagrasses play a prominent role in SDGs. Thus, 
well-managed, sustainably exploited seagrass 
meadows that are in a state of ecological balance 
(32, 33, 55) will contribute to reducing poverty 
(56), reducing hunger (32), responsible consump- 
tion and production (57), and decent work and 
economic growth (58) (Fig. 3). Sustainably 
managed seagrass fisheries in many parts of the 
world also contribute toward gender equality 
and reducing other inequalities. For example, 
the role of women is underappreciated in inter- 
tidal and near-shore small-scale subsistence fish- 
eries (59), of which seagrass meadows are a 
major component. Inclusion of women in these 
fisheries is well known to improve community 
adaptative capacity and resilience (60), leading 
to improved environmental outcomes (59). 

A major ecological role of healthy seagrass 
systems is to make the wider environment more 
conducive for animal life (including humans) in 
both marine and coastal environments. Seagrass 
habitats enhance oxygenation in marine sedi- 
ments; trap particles in the water column, im- 
proving water clarity; cycle and store nutrients; 
and reduce the bacterial and viral load in coastal 
waters. This creates a three-dimensional envi- 
ronment that harbors biodiversity, baffles wave 
energy to protect coastlines from erosion, and 
further enhances the whole coastal seascape for 
biodiversity (e.g., through the protection of adja- 
cent habitats, such as coral reefs and mangroves). 

The bioengineering effect that seagrasses have 
on their own environment also contributes to 


[Fig. 2, right panel: Disturbed and eutrophied seagrass; labels: N₂O, CH₄]


Fig. 2. The greenhouse gas balance of seagrass. There are many competing processes that result in 
seagrass meadows becoming net sources or sinks of greenhouse gases in our oceans. The left panel 
illustrates a healthy meadow where net photosynthetic productivity and dense seagrass is leading to rapid 
trapping and storage of carbon into the sediments. Although we lack a full understanding about greenhouse 
gas balance in seagrasses and the implications of disturbance, the right panel illustrates how meadow 
degradation and eutrophication can lead to the remobilization and loss of stored carbon and the potential 
increased production of CH₄ and N₂O. We also know little about the consequences of calcification by
associated fauna within productive seagrass meadows on the overall carbon balance.

SDGs related to clean water and sanitation, good
health, and well-being (12). Additionally, there is
increasing appreciation of the value of seagrass 
for storing and sequestering carbon and the 
potential value of conserving seagrass meadows 
for climate mitigation (61). We understand that
seagrasses enhance life below water, but less- 
well appreciated is that seagrass systems also 
enhance life on land by providing resources to 
shoreline habitats and populations, especially 
birds (62). The biodiversity present within sea- 
grass meadows, the ecological processes and 
functions within them, and their relatively easy 
access also provide educational opportunities 
for human communities (63). 

Without strong partnerships between commu- 
nities, governments, nongovernmental organiza- 
tions, and the private sector, seagrass conservation 
and restoration will not work effectively. The 
final SDG is about this bigger ambition. In the 
UK, the conservation charity Project Seagrass is 
bringing together private sector companies (e.g., 
CGI and Ocean Infinity), universities (e.g., Swansea 
and Heriot-Watt), institutes (e.g., NOC), and the 
government (e.g., the Hydrographic Agency) to 
map the UK’s seagrass meadows. Similar ini- 
tiatives are happening globally in places such as 
the Seychelles, Australia, and Indonesia. 

Many aspects of the SDGs focus on the hu- 
man planet, where the role that seagrasses play 
is changing with respect to a changing climate. 
With an expanding need to harness the energy 
of our oceans through wind, waves, and tide, 


there is increasing potential for new infrastruc- 
ture to come into conflict with seagrass eco- 
systems. At the same time, this could lead to 
improved outcomes for seagrass, especially at 
a time when there is increasing global recogni-
tion of the need to develop strong criteria and 
indicators for pathways toward nature-positive 
outcomes. One such mechanism is that adopted 
in Australia, where marine biodiversity offset- 
ting is accepted as a component of development 
consent to achieve an ambition of no net loss of 
biodiversity. A failed push toward tidal lagoon 
power in the UK provided impetus for seagrass 
restoration, and there is a growing focus on using 
seagrass restoration as a means of enhancing 
fish habitat as an offset to the effect of offshore 
wind power installations on marine biodiversity. 
The decline and reduced use of major historic 
urban coastal infrastructure, such as disused 
docklands, fisheries ponds, and mill ponds, are 
typical of many areas of the temperate Northern 
Hemisphere. The large empty docklands of South 
Wales provide an exemplary opportunity for 
seagrass restoration, and in southern Spain, 
entrepreneurial restaurateurs are bringing dis- 
used salt ponds back to life with seagrass for 
the growth of food products (64). 


Charting a pathway to the net 
recovery of seagrass 


Solutions for seagrass conservation and resto- 
ration have never been more urgent given the 
ongoing risks they face (15) and their potential 


role in helping mitigate climate change and 
the biodiversity crisis (27). Given the real and 
immediate threat of runaway climate change 
that places the future of humanity at risk, we 
need to rapidly move toward a conservation 
and restoration model that focuses on achiev- 
ing global net recovery of seagrass (Fig. 4). 
Although financial mechanisms are emerging 
that begin to place monetary value onto sea- 
grass carbon stores and carbon sequestration 
potential that will enable greater conservation 
and restorative action, concern exists about the 
potential for perverse and unintended conse- 
quences of such mechanisms (including interna- 
tional ownership of local resources), particularly 
around their role in supporting livelihoods (56). 

It has been argued that avoiding a climate 
catastrophe requires at least three global trans- 
formations that are unprecedented in both 
magnitude and speed (54). One of these is a 
transformation of our relationship with nature 
to one that conserves, restores, and enhances its 
benefits for people and the planet (54). The 
SDGs could provide a valuable lens for secur- 
ing the wider ecological role of seagrass mea- 
dows beyond carbon sequestration. 

Seagrass habitats are global; estimates of loss 
are widespread and varied, but there is general 
agreement that the loss is vast. However, this 
does mean that there is huge potential for 
nature-based solutions focused on seagrass 
restoration. A restored seagrass meadow may 
take many years and be high cost in terms of 


Seagrass conservation supports 16 of the 17 Sustainable Development Goals

1 No poverty: Seagrass ecosystem services for poverty alleviation (substrate for living, subsistence, and livelihoods)
2 Zero hunger: Seagrass subsistence fisheries support zero hunger
3 Good health and well-being: Seagrass bioengineers its environment, making it more affable and increasing the nutritional value of fish
4 Quality education: Seagrass ecosystems provide a means of teaching children core scientific principles (e.g., photosynthesis and environmental values)
5 Gender equality: Empowering women in seagrass fisheries (access to food and income for women)
6 Clean water and sanitation: Healthy seagrass filters and cleans water
7 Affordable and clean energy: Seagrass restoration can be embedded in marine renewable energy
8 Decent work and economic growth: Sustainable seagrass fisheries and green restoration jobs promote seagrass
9 Industry, innovation, and infrastructure: Opportunity for using seagrass as a trailblazer for the uptake of net gain within industrial marine biodiversity loss
10 Reduced inequalities: Management of seagrasses and their fisheries supports the underappreciated role of women in these activities
11 Sustainable cities and communities: Old coastal urban heritage infrastructure (e.g., ports) creates opportunities for seagrass restoration
12 Responsible consumption and production: Seagrass conservation requires sustainable management of associated resources
13 Climate action: Seagrass meadows store and sequester carbon
14 Life below water: Seagrasses bioengineer the seabed, enhancing life and biodiversity underwater
15 Life on land: Seagrasses support coastal defense, provide trophic subsidy to the coast, and support coastal avifauna
16 Peace, justice, and strong institutions: No major role
17 Partnerships for the goals: Improved seagrass conservation and restoration activity requires strong cross-sectoral partnerships

Fig. 3. Seagrass and sustainable development. Conservation and restoration of seagrass meadows and their ecological role can be communicated through the lens 
of the SDGs, of which seagrasses contribute to 16 of the 17 goals. A major part of this contribution is through the roles that they have as bioengineers and in 
supporting fisheries. 
Fig. 4. A trajectory of seagrass recovery. Business as usual: Although seagrass meadows provide extensive
ecosystem services and offer a major global opportunity as nature-based solutions, without intervention, they 
remain on a trajectory of global decline throughout the next century. Pathway to net gain: If major conservation 
action is taken to halt and reverse seagrass loss and degradation, then seagrasses can provide major contributions 
to fulfilling the aims of 16 of the 17 SDGs and for providing a major nature-based solution to climate change. 

Net gain of biodiversity requires avoidance of damage (e.g., legal instruments to halt bottom trawling) or 
minimization of effects that cannot be avoided, restoration to enhance or recreate habitats after damage (e.g., 
advanced mooring systems to allow recovery from boat damage), compensation, and recovery to enhance or 
recreate habitats known to have been historically lost or degraded (e.g., by active replanting). Image uses silhouettes 
created using symbols from the IAN Library, UMCES, University of Maryland. 


labor and infrastructure to become ecologically 
functional (65, 66). The opportunity provided 
by seagrass restoration should not detract from 
the urgent need to protect what we already 
have. As seagrass meadows become degraded, 
they not only begin to become net emitters of 
carbon, but they also release large amounts of 
nitrogen and sediments into the coastal eco- 
system (34), together with any potential con- 
taminants trapped within (e.g., heavy metals 
or plastics) (67). Achieving no net loss (and ulti- 
mately global net gain) of seagrasses requires 
scientific vision and political will (Fig. 3). This 
will not be easy, but we know that cumulative 
and connected conservation of seagrass over 
large scales can have major economic and en- 
vironmental benefits (65). In general, plant 
conservation lags behind the conservation of 
animals (68), but seagrass could provide a model 
for how to overcome this so-called plant blind- 
ness, especially in the context of nature-based 
solutions (69). 

Seagrasses have previously been described as 
the “ugly duckling” of marine conservation (70), 
but their star has risen with increasing interest 
in their potential to contribute to nature-based 
solutions to climate change and sustainable de- 
velopment. However, there are substantial ecolog- 
ical, social, and regulatory barriers and bottlenecks 
to seagrass restoration and conservation because 
of the scale of the interventions required. We
must work inclusively at a local scale but in a 
globally connected network. Advances in marine 
robotics, molecular ecology, remote sensing, and 
artificial intelligence offer new opportunities to 
solve conservation problems in difficult envi- 
ronments at unprecedented global scales. 
RESEARCH


Hidden selection against interbreeding 


Today, humans are the only extant members of our genus, Homo. This was not the case

in the past, because we now know that our ancestors shared the planet with other Homo 

species. It has been suggested that selection against hybrid individuals would have 

acted against breeding across these species, but such a hypothesis is difficult to test 

today. To study this question, Vilgalys et al. took advantage of a decades-long dataset on 
two species of baboon from the Amboseli basin of Kenya. They found evidence of selection 
against hybrid, or admixed, ancestry similar to what has been predicted for ancestral homi- 
nids. Although evidence for selection against hybrids was clear, they also found that individual 
hybrids can thrive. —SNV
Science, abm4917, this issue p. 635


Baboons in the Amboseli basin of Kenya, such as the individual here, show signs of selection against genetic
admixture between species.


Shift from meiosis 
to mitosis 


In angiosperm plants, meiosis 
leads to a haploid but multicel- 
lular reproductive structure. 
Cairo et al. studied the molecular 




regulation of the shift from 
meiosis to mitosis that generates 
this reproductive structure in the 
small mustard plant Arabidopsis. 
They found that a germline-spe- 
cific protein that drives meiotic 
exit is incorporated into P-bodies 
during the second meiotic 


division, where it transiently 
sequesters translation initiation 
factors, thus inhibiting translation. 
The authors suggest that this 
recognition process remodels the 
translatome and facilitates the 
transition to mitosis. —PJH 
Science, abo0904, this issue p. 629 


Repurposing against 
parasites 

Available drugs used to 

combat the apicomplexan 
parasites Toxoplasma gondii and 
Plasmodium falciparum have 
limited efficacy and undesirable 
side effects. Using a drug- 
repurposing screen, Swale et 

al. identified altiratinib, a drug 
originally developed to treat 
glioblastoma, as a potential 
antiparasitic drug. Through 
drug resistance screening and 
genetic target deconvolution, a 
T. gondii kinase involved in cell 
cycle progression was identi- 
fied as the primary target of 
altiratinib, which was confirmed 
for the corresponding kinase 

in P. falciparum. Altiratinib

was shown to globally disrupt 
protein splicing by targeting this 
family of parasite kinases. These 
findings support the further 
development of pan-apicom- 
plexan inhibitors that target this 
pathway. —CNF 

Sci. Transl. Med. 14, eabn3231 (2022). 


Continuous time crystals 
Time crystals are a new dynami- 
cal phase of quantum matter 
resulting from the breaking 
of time-translation symmetry 
and the subsequent interplay 
between interactions forming 
self-organized phases. To date, 
discrete time crystals have 
been observed in periodically 
driven systems. By contrast, 
Kongkhambut et al. report 
the observation of spontane- 
ous breaking of a continuous 
time translation symmetry 
in an atomic Bose-Einstein 
condensate inside a high- 
finesse optical cavity (see the 
Perspective by LeBlanc). Using 
a time-independent pump, the 
authors observed a limit cycle 
phase that was characterized 
by emergent periodic oscilla- 
tions of the intracavity photon 
number and was accompanied 
by the atomic density cycling 
through recurring patterns: a 
continuous time crystal. —ISO 
Science, abo3382, this issue p. 670; 
see also add2015, p. 576
T CELLS
Intestinal epithelium 
maintains T cells 


Immune surveillance of the 
intestinal epithelium is crucial 
for intestinal homeostasis and 
protecting against infection. 
Seo et al. used mouse knockout 
models to demonstrate that 
expression of herpes virus entry 
mediator (HVEM) on intestinal 
epithelial cells is crucial for the 
survival and patrolling ability 
of intraepithelial CD8⁺ T cells.
The protein LIGHT, a ligand for 
HVEM, was crucial for these 
HVEM-driven effects. Epithelial 
knockout of HVEM led to poorer 
protection against intestinal 
bacterial infection. These data 
show that HVEM expression 
in intestinal epithelial cells is 
involved in intraepithelial T cell–mediated
immune surveillance
of the small intestine. —DAE 

Sci. Immunol. 7, eabm6931 (2022).


CRYSTALLIZATION 
Growing crystalline 
containers 


Hierarchical crystals are often 
found in nature, and biological 
processes delicately control 
their growth and patterning. 
For synthetic systems, this 
process remains challenging. 
For example, skeletal crystals 
with concave morphology 
usually require additives and 
complicated temperature 
control. Oki et al. succeeded in 
growing teacup-shaped micro- 
crystals in a controlled, uniaxial 
manner by casting a solution 
of a planar-chiral molecule 

on a clean quartz substrate


Scanning electron image of a partially 
formed cup-shaped crystal 




and evaporating it, causing 
supersaturation and crystal 
growth. The authors show that 
despite the overall geometrical 
complexity of the crystal, it 
emerges from the highly sym- 
metric ordering of the chiral 
planar molecules that stack 
on one another with a counter- 
clockwise rotation of 60° along 
a crystallographic sixfold screw
axis. —MSL 

Science, abm9596, this issue p. 673 


OCEAN OXYGEN 
Deep concentration 


Between about 1.25 million 
and 800,000 years ago, the 
climate system went through a 
major change during a period 
called the Middle Pleistocene 
Transition. Was the inventory 
of dissolved oxygen in the 
ocean affected by this episode? 
Thomas et al. show that oxygen 
concentrations in glacial deep 
North Atlantic waters suffered 
a stepped reduction about 
900,000 years ago, coincident 
with reductions in the concen- 
tration of glacial atmospheric 
carbon dioxide and global ice 
volume. —HJS 

Science, abj7761, this issue p. 654 


ORGANIC CHEMISTRY 
Carbonyls to carbenes 


Carbenes are versatile chemi- 
cal intermediates because of 
their highly reactive divalent 
carbon centers. Unfortunately, 
typical carbene precursors 
such as diazo compounds are 
often sensitive and decompose 
explosively. Zhang et al. now 
report a comparatively safe 
protocol to generate carbenes 
from aldehydes, a plentiful and 
easily diversified substrate class 
(see the Perspective by West 
and Rousseaux). The sequence 
entails successive reactions of the 
aldehyde with pivaloyl chloride 
and zinc, followed by catalytic 
activation using iron, cobalt, or 
copper. Demonstrated applica- 
tions included wide-ranging 
examples of three-membered 
ring formation and sigma-bond 
insertion. —JSY 

Science, abo6443, this issue p. 649; 

see also abq8253, p. 580
MACHINE LEARNING
Machine learning for 
surface catalysis 


To date, only a tiny fraction 

of the enormous diversity of 
elementary surface reactions 
has been explored, yet it has 
already provided invaluable 
assistance in improving our 
understanding of heterogeneous 
catalysis. Microkinetic model- 
ing of heterogeneous catalysis 
(i.e., exploring a detailed list of 
elementary surface reactions) is 
challenging because it requires 
simultaneous consideration 

of many different aspects of 
complex interfacial processes. 
Shi et al. developed a comprehensive
machine learning–based
approach for automated reaction 
pathway analysis and generated 
a detailed microkinetic model for 
carbon dioxide/carbon monoxide 
hydrogenation on copper–zinc


Edited by Caroline Ash 
and Jesse Smith 


surfaces, which resolved several 
long-standing questions about 
this industrial process. The pro- 
posed approach can operate with 
thousands of reaction pathways 
and different surface coverages 
and is potentially applicable to 
other heterogeneous catalytic 
systems. —YS 

J. Am. Chem. Soc. 10.1021/jacs.2c06044 (2022).


MARINE SCIENCE 
Eavesdropping on whales 


Wildlife monitoring in the 
oceans can be challenging 
because of widespread under- 
sampling. Recent developments 
in using undersea fiber-optic 
cables for sensing provide a dif- 
ferent opportunity to solve this 
problem. Bouffaut et al. were
able to successfully monitor 
whales with distributed acoustic 


sensing using fiber-optic cables
in the Arctic. The authors 
recorded vocalizations between 
baleen whales, estimated their 
locations, and used the vocal- 
izations for seafloor subsurface 
exploration. Their observations 
demonstrate the potential for 
using fiber-optic infrastructure 
to listen to the less-accessible 
parts of our oceans. —BG 
Front. Mar. Sci. 10.3389/fmars.2022.901348 (2022).


NEUROSCIENCE
Memory formation
in sleep


Forming indirect associations
between learned items with
overlapping components is
crucial for abstract problem
solving. This type of learning
is a fundamental feature of
relational memory. Sleep is
important for learning relational
memory tasks. However, we
do not know what biophysical
mechanisms enable this
function. Using a biophysical
model of a thalamocortical
network, Tadros and Bazhenov
tested the role of non-rapid
eye movement sleep (during
which the sleeper experiences
slower muscular and brain
activity) on the network's ability
to perform a relational memory
task. After periods of slow-wave
sleep, the network could form
indirect inferences that were
never trained directly. Sleep
replay increased connections to
and from a shared conjunctive
memory unit, which increased
its performance during relational
memory tasks. These
modeling studies produced predictions
that can now be tested
by experiments. —PRS

J. Neurosci. 42, 5330 (2022).


BIOGEOGRAPHY
Dispersal effects on island diversity


Islands that are smaller or farther from the mainland
tend to have fewer species. Walentowitz et al.
asked how this classic pattern varies across plant
species with different modes of seed dispersal (i.e.,
self-dispersing or moved by animals, wind, or water)
and whether it is influenced by human impacts. Using
data from 54 small islands off the coast of Denmark,
the authors found that area had stronger effects than
isolation on plant species richness and composition.
Larger islands had more species and more animal- and
wind-dispersed species. Human activities increased
species richness and may also increase seed dispersal
between islands, effectively reducing their isolation,
but did not erase the effects of island area. —BEL

J. Biogeogr. 10.1111/jbi.14454 (2022).


The Danish archipelago displays a spectrum of plant biodiversity depending on island size.


HEART REGENERATION 
Old hearts learn 
new tricks 


Aging-related diseases such 
as heart failure and other 
cardiovascular disorders are 
the leading causes of death in 
many countries, and they are 
becoming increasingly com- 
mon worldwide as the number 
of older people increases. The 
ability of the heart to produce 
new cardiomyocytes decreases 
with age, which makes it more 
difficult to repair damage and 
increases the risk of heart 
failure. However, a study by 
Lerchenmüller et al. suggests
that exercise may offer some 
help in this regard even if 
started late in life. The authors 
had previously reported that 
voluntary exercise can stimulate 
the generation of cardiomyo- 
cytes in young adult mouse 


hearts, and now they have also 
observed this phenomenon in 
aged animals. —YN 
Circulation 10.1161/CIRCULATIONAHA.121.057276 (2022).


POLITICAL ECONOMY 
Paying for patriots 


US federal government 
spending in the 1930s under 
Depression-era New Deal social 
programs “created a new geog- 
raphy of patriotism” manifested 
a short time later during World 
War II. Caprettini and Voth show
that government war bond pur- 
chases, volunteering for military 
service, and being awarded for 
heroism in the war, all of which 
entailed personal sacrifice for 
the country, were more com- 
mon in counties that received 
more per capita government 
relief under the New Deal. This 
work refines our understanding 
of social contracts and reci- 
procity between governments 
and constituents. —BW
Q. J. Econ. 10.1093/qje/qjac028 (2022).


CELL BIOLOGY 
Fusexins found 
in archaea 


Fusexins are proteins that medi- 
ate cellular membrane fusion. 
These proteins were initially 
discovered in viruses, in which 
they merge the viral envelope 
with host cell membranes 
during invasion. In eukaryotes, 
fusexins mediate syncytial 
tissue development in worms 
and gamete fusion in plants and 
protists. Despite their wide- 
spread presence in eukaryotes, 
it is not known whether fusexins 
originated in the first eukary- 
otes and were later captured by 
viruses or the other way around. 
Moi et al. found that archaea 
contain the protein Fusexin1,
which is structurally similar to 
fusexins from viruses, plants, 
and animals and can promote 
cell-cell fusion. This discovery 
raises the possibility that gam- 
ete fusion proteins originated in 
archaeal cells. —SMH 

Nat. Commun. 13, 3880 (2022). 
CORONAVIRUS
A mosaic approach to
protection


The COVID-19 pandemic has 
been ongoing for more than 
2 years now, and new vari- 
ants such as Omicron are less 
susceptible to the vaccines 
developed against earlier 
lineages of severe acute respira- 
tory syndrome coronavirus 2 
(SARS-CoV-2). In addition, there 
is continued risk of spillovers 
of other animal sarbecoviruses 
into humans. There is thus a 
need for vaccines that will give 
broader protection. Cohen et al. 
developed mosaic nanoparticles 
that display the receptor- 
binding domains (RBDs) from 
SARS-CoV-2 and seven other 
animal sarbecoviruses. Mosaic 
nanoparticles protected against 
both SARS-CoV-2 and SARS- 
CoV challenges in animal models 
even though the SARS-CoV RBD 
was not present on the mosaic-8 
RBD nanoparticles. By contrast,
a homotypic SARS-CoV-2
RBD nanoparticle (presenting 
only SARS-CoV-2 RBDs) only 
protected against a SARS-CoV-2 
challenge. —VV 

Science, abq0839, this issue p. 618


HEART DISEASE 
A close-up look at
cardiomyopathies 


Cardiomyopathies are dis- 
eases of the heart muscle 

that interfere with its ability to 
pump blood effectively and can 
result in heart failure. They are 
divided into different categories 
based on the clinical presenta- 
tion. Reichart et al. performed 
single-nucleus RNA sequencing 
on heart samples from patients 
with cardiomyopathies with or 
without known genetic causes, 
as well as samples from controls 
without structural heart disease. 
The authors identified key cell 
types and their locations in the 
heart, cellular interactions, and 
biological signaling pathways, 
offering insights into the biology 




of the heart in healthy and dis- 
eased states. —YN 
Science, abo1984, this issue p. 619 


DEVELOPMENT 
Drosophila embryo 
analyzed cell by cell 


Animal development can 
progress quite rapidly, with cel- 
lular lineages proliferating and 
differentiation status changing 
minute to minute. Calderon et 
al. have now visualized develop- 
ment in the fruit fly Drosophila 
in greater detail than ever 
before. Taking advantage of the 
ability to produce collections of 
Drosophila embryos that differ 
in developmental stage by only 
seconds or minutes, the authors 
have analyzed, on a single-cell
basis, how chromatin acces- 
sibility and gene expression 
shift during Drosophila embryo- 
genesis. This single-cell atlas 
of Drosophila embryogenesis 
reveals cell lineages and their 
developmental relationships and 
links enhancer usage and gene 
expression. —BAP 

Science, abn5800, this issue p. 620


MEDICINE 
Mitochondrial protection 
from ischemia 


Damage caused by lack of oxy- 
gen, or ischemia, is an important 
consequence of heart attack and 
stroke. Kynurenic acid, a product 
of metabolism of the amino acid 
tryptophan, has shown protective 
effects against ischemia in animal 
models. Wyant et al. found that 
these effects are mediated by the 
heterotrimeric guanine nucleotide–binding
protein–coupled
receptor (GPCR) GPR35 (see the 
Perspective by Cadenas). Mouse 
hearts lacking GPR35 lost the 
beneficial effect when treated 
with kynurenic acid. In mouse 
neonatal cardiomyocytes stimu- 
lated with kynurenic acid, GPR35 
associated with mitochondria
and appeared to interact
indirectly with ATP synthase
inhibitory factor subunit 1,
reducing the loss of ATP during 
ischemia. The findings further 
support the possible therapeutic 
use of GPR35 agonists to reduce 
the deleterious effects of isch- 
emia. —LBR 

Science, abm1638, this issue p. 621; 

see also add4629, p. 579


SYSTEMS BIOLOGY 
Cellular information 
processing 


Prior studies have shown that 
the responses of individual cells 
to growth factors are variable, 
raising the question of whether 
cells detect more informa- 
tion than the mere presence 
or absence of growth factor. 
Kramer et al. monitored the 
response of cells to various 
concentrations of epidermal 
growth factor. They found that 
responses are indeed variable, 
apparently because signaling 
nodes store other information 
about the state of the cell and 
thus respond differently to the 
same concentration of growth 
factor. Thus, cells have a way to 
process information in the con- 
text of their surroundings and 
current cell state. —LBR
Science, abf4062, this issue p. 642 


IMMUNOLOGY 
The “hex” factor behind 
IEL development 


CD4⁺CD8αα⁺ intraepithelial
lymphocytes (CD4IELs) are
a class of intestinal innate-like
T cells that contribute to
various immune responses, 
including oral tolerance. Their 
development depends on 

the gut microbiota, but the 
precise antigens that these 

cells recognize have remained 
elusive. Bousbaine et al. report 
that β-hexosaminidase (β-hex),
a highly conserved enzyme
produced by commensals from
the Bacteroidetes phylum, drives
CD4IEL differentiation in the gut
(see the Perspective by Shenoy 


and Koch). T cell receptors from 
both tissue-resident CD4IELs 
and regulatory T cells in gut- 
draining lymph nodes recognized 
β-hex peptides. β-hex–specific
CD4 T cells transferred into mice
differentiated into CD4IELs that
localized to the small intestine.
There, they partially suppressed 
inflammation in a regulatory T 
cell-independent manner. This 
study highlights how intestinal 
immune responses to com- 
mensal bacteria can regulate 
inflammation. —STS 

Science, abg5645, this issue, p. 660; 

see also add7145, p. 575


MUSCLE REPAIR 
JMJD3 primes stem cells 
for inflammation 


After muscle injury, muscle 
stem cells must coordinate with 
immune cells in the inflamed 
tissue to ensure efficient 
repair. Nakka et al. identified an 
essential role for the epigen- 
etic enzyme KDM6B/JMJD3 in 
establishing the communica- 
tion between muscle stem cells 
and infiltrating immune cells 
during muscle repair (see the 
Perspective by Gabellini). They 
found that, in response to injury, 
removal of the transcriptionally 
repressive histone H3K27me3
modification by KDM6B/JMJD3 
allows muscle stem cells to pro- 
duce hyaluronic acid that is then 
incorporated into the extracellu- 
lar matrix. This remodeling of the 
extracellular matrix allows the 
muscle stem cell to receive sig- 
nals from the infiltrating immune 
cells that initiate regeneration. 
—SMH and BAP 

Science, abm9735, this issue p. 666; 

see also add6804, p. 578 


CANCER 
CD9 tags tumorigenic 
packages from CAFs 


Extracellular vesicles (EVs) shed 
from pancreatic cancer–associated
fibroblasts containing
the membrane protein ANXA6
induce aggressive phenotypes in
tumors. By analyzing EVs from
cancer–associated fibroblasts
from patient samples and mouse 
models of pancreatic ductal 
carcinoma, Nigri et al. found 
that the cell-surface glycopro- 
tein CD9 mediated the uptake 
of ANXA6-positive EVs. Either 
blocking CD9 on isolated EVs 
or inhibiting the kinase p38 in 
pancreatic ductal carcinoma 
cells prevented ANXA6-positive 
EVs from inducing cell migra- 
tion and a mesenchymal protein 
signature. —LKF 

Sci. Signal. 15, eabg8191 (2022). 


CANCER 
Engineered protein blocks
cancer stem cells


Glioblastoma multiforme (GBM) 
is the most lethal brain cancer 
in adults. At present, there is no 
cure for this devastating disease, 
and current therapies mini- 
mally increase overall survival. 
Benedetti et al. engineered 
SOX2, a key transcription factor 
implicated in GBM malignancy,
with the aim of rewiring a key 
oncogenic gene network. The 
authors show that this strategy 
can block tumor growth and 
cancer stem cell activities in 
vitro and in mouse xenografts of 
human tumors. Because SOX2
has a primary role in promoting 
tumor development not just in 
GBM but also in lung, pros- 
tate, and breast cancers, this 
approach might be an innovative 
strategy against other deadly 
cancers. —LDC 
Sci. Adv. 10.1126/sciadv.abn3986 (2022).
INTRODUCTION: Two animal coronaviruses from
the severe acute respiratory syndrome (SARS)- 
like betacoronavirus (sarbecovirus) lineage, SARS 
coronavirus (SARS-CoV) and SARS-CoV-2, have 
caused epidemics or pandemics in humans in 
the past 20 years. SARS-CoV-2 triggered the 
COVID-19 pandemic that has been ongoing for 
more than 2 years despite rapid development 
of effective vaccines. Unfortunately, new SARS- 
CoV-2 variants, including multiple heavily 
mutated Omicron variants, have prolonged 
the COVID-19 pandemic. In addition, the dis- 
covery of diverse sarbecoviruses in bats raises 
the possibility of another coronavirus pandemic. 
Hence, there is an urgent need to develop vac- 
cines and therapeutics to protect against both 
SARS-CoV-2 variants and zoonotic sarbeco- 
viruses with the potential to infect humans. 




RATIONALE: To combat future SARS-CoV-2 var- 
iants and spillovers of sarbecoviruses threat- 
ening global health, we designed nanoparticles 
that present 60 randomly arranged spike 
receptor-binding domains (RBDs) derived from 
the spike trimers of eight different sarbeco- 
viruses (mosaic-8 RBD nanoparticles) to elicit 
antibodies against conserved and relatively 
occluded—rather than variable, immunodomi- 
nant, and exposed—epitopes. The probability 
of two adjacent RBDs being the same is low for 
mosaic-8 RBD nanoparticles, a feature chosen 
to favor interactions with B cells whose bivalent 
receptors can cross-link between adjacent RBDs 
to use avidity effects to favor recognition of 
conserved, but sterically occluded, RBD epi- 
topes. By contrast, nanoparticles that present 
60 copies of SARS-CoV-2 RBDs (homotypic 
Mosaic RBD nanoparticle vaccination protects and elicits antibodies against conserved epitopes. Mosaic-8
elicited broader cross-reactive responses than those of homotypic nanoparticles. In a stringent infection model 
[K18-human angiotensin-converting enzyme 2 (K18-hACE2)], both protected against matched challenge 
(SARS-CoV-2 Beta), but only mosaic-8 also protected against a mismatch (SARS-CoV). Mosaic-8-immunized NHPs 
were protected against mismatched SARS-CoV-2 (Delta) and SARS-CoV. Mosaic-8-elicited antibodies predominantly 
bound conserved epitopes, whereas homotypic-elicited antibodies predominantly bound variable epitopes. 
RBD nanoparticles) are theoretically more
likely to engage B cells with receptors that 
recognize immunodominant and sterically 
accessible, but less conserved, RBD epitopes. 


RESULTS: We compared immune responses 
elicited by mosaic-8 (SARS-CoV-2 RBD plus 
seven animal sarbecovirus RBDs) and homo-
typic (only SARS-CoV-2 RBDs) nanoparticles 
in mice and macaques and observed stronger 
responses elicited by mosaic-8 to mismatched 
(not represented with an RBD on nanopar- 
ticles) strains, including SARS-CoV and ani- 
mal sarbecoviruses. Mosaic-8 immunization 
produced antisera that showed equivalent 
neutralization of SARS-CoV-2 variants, includ- 
ing Omicron variants, and protected from both 
SARS-CoV-2 and SARS-CoV challenges in mice 
and nonhuman primates (NHPs), whereas 
homotypic SARS-CoV-2 immunization pro- 
tected from SARS-CoV-2 challenge but not 
from SARS-CoV challenge in mice. Epitope 
mapping of polyclonal antisera by using deep 
mutational scanning of RBDs demonstrated 
targeting of conserved epitopes after immu- 
nization with mosaic-8 RBD nanoparticles, in 
contrast with targeting of variable epitopes 
after homotypic SARS-CoV-2 RBD nanopar- 
ticle immunization, which supports the hy- 
pothesized mechanism by which mosaic RBD 
nanoparticle immunization can overcome im- 
munodominance effects to direct production 
of antibodies against conserved RBD epitopes. 
Given the recent plethora of SARS-CoV-2 var- 
iants that may be arising at least in part be- 
cause of antibody pressure, a relevant concern 
is whether more conserved RBD epitopes 
might be subject to substitutions that would 
render vaccines and/or monoclonal antibodies 
targeting these regions ineffective. This sce- 
nario seems unlikely because RBD regions 
conserved between sarbecoviruses and SARS- 
CoV-2 variants are generally involved in con- 
tacts with other regions of spike trimer and 
therefore less likely to tolerate selection- 
induced substitutions. 


CONCLUSION: Together, these results suggest 
that mosaic-8 RBD nanoparticles could pro- 
tect against SARS-CoV-2 variants and future 
sarbecovirus spillovers—in particular, high- 
lighting the potential for a mosaic nanopar- 
ticle approach to elicit more broadly protective 
antibody responses than those with homo- 
typic nanoparticle approaches. 
Mosaic RBD nanoparticles protect against challenge 
by diverse sarbecoviruses in animal models 
To combat future severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants and spillovers 
of SARS-like betacoronaviruses (sarbecoviruses) threatening global health, we designed mosaic 
nanoparticles that present randomly arranged sarbecovirus spike receptor-binding domains (RBDs) to 
elicit antibodies against epitopes that are conserved and relatively occluded rather than variable, 
immunodominant, and exposed. We compared immune responses elicited by mosaic-8 (SARS-CoV-2 and 
seven animal sarbecoviruses) and homotypic (only SARS-CoV-2) RBD nanoparticles in mice and 
macaques and observed stronger responses elicited by mosaic-8 to mismatched (not on nanoparticles) 
strains, including SARS-CoV and animal sarbecoviruses. Mosaic-8 immunization showed equivalent 
neutralization of SARS-CoV-2 variants, including Omicrons, and protected from SARS-CoV-2 and 
SARS-CoV challenges, whereas homotypic SARS-CoV-2 immunization protected only from SARS-CoV-2 
challenge. Epitope mapping demonstrated increased targeting of conserved epitopes after mosaic-8 
immunization. Together, these results suggest that mosaic-8 RBD nanoparticles could protect against 
SARS-CoV-2 variants and future sarbecovirus spillovers. 


Two animal coronaviruses from the sarbecovirus 
lineage, severe acute respiratory syndrome 
coronavirus (SARS-CoV) and SARS-CoV-2 
(hereafter SARS-1 and SARS-2, respectively), 
have caused epidemics or pandemics in humans in the past 
20 years. SARS-2 triggered the COVID-19 pan- 
demic that has been ongoing for more than 
2 years despite rapid development of effective 
vaccines (1). Unfortunately, new SARS-2 variants 
of concern (VOCs), including the heavily 
mutated Omicron VOCs (2-7), have prolonged 
the COVID-19 pandemic. In addition, the dis- 
covery of diverse sarbecoviruses in bats—some 
of which bind the SARS-1 and SARS-2 entry 
receptor, angiotensin-converting enzyme 2 
(ACE2) (8-14)—raises the possibility of another 
coronavirus pandemic. Hence, there is an 
urgent need to develop vaccines and thera- 
peutics to protect against both SARS-2 VOCs 
and zoonotic sarbecoviruses. 

Currently approved SARS-2 vaccines in- 
clude the viral spike (S) trimer (1), which is 
consistent with S being the primary target of 
neutralizing antibodies (15-24). A corona- 
virus S trimer mediates entry into a host cell 
after one or more of its receptor-binding do- 
mains (RBDs) adopt an “up” position that 
allows interactions with a host cell receptor 
(Fig. 1A). Many of the most potent neutraliz- 
ing antibodies against SARS-2 block binding 
of ACE2 to the RBD (16-20, 23-29), and RBD 
targeting has been suggested for COVID-19 
vaccine development (30). We classified neu- 
tralizing anti-RBD antibodies into four main 
classes (classes 1, 2, 3, and 4) on the basis of 
their epitopes and whether they recognized 
“up” and/or “down” RBDs on S trimers (26). 
The potent class 1 and class 2 antibodies to 
RBD, whose epitopes overlap with the ACE2 
binding footprint, recognize a portion of the 
RBD that exhibits high sequence variability 
between sarbecoviruses (26). By contrast, the 
epitopes of class 4 antibodies, and to a some- 
what lesser extent class 3 antibodies, map to 
more conserved, but less accessible, regions 
of sarbecovirus RBDs (Fig. 1A). Substitutions 
in the RBDs of VOCs and variants of interest 
(VOIs) are also less common in the class 4 
and class 3 regions (Fig. 1A and fig. S1), thus 
suggesting that a vaccine strategy designed 
to elicit class 3, class 4, and class 1/4 [class 4-targeting 
antibodies that block ACE2 binding 
(31-33)] could protect against potentially 
emerging zoonotic sarbecoviruses as well as 
current and future SARS-2 variants. 

Here, we describe animal immunogenic- 
ity and virus challenge studies to evaluate 
mosaic-8 RBD nanoparticles, a potential pan- 
sarbecovirus vaccine in which RBDs from 
SARS-2 and seven animal sarbecoviruses (14) 
were covalently attached to a 60-mer pro- 
tein nanoparticle (34). The probability of 
two adjacent RBDs being the same is low 
for mosaic-8 RBD nanoparticles, an arrange- 
ment chosen to favor interactions with B cells 
whose receptors can cross-link between ad- 
jacent RBDs to use avidity effects (35) to 
preferentially recognize conserved, but steri- 
cally occluded, class 3, class 4, and class 1/4 
RBD epitopes (Fig. 1B). By contrast, homotypic 
SARS-2 RBD-mi3, a nanoparticle including 
60 copies of a SARS-2 RBD (34, 36), is more 
likely to engage B cells with receptors that 
recognize sterically accessible, but less con- 
served, class 1 and class 2 RBD epitopes (Fig. 1B). 
As previously observed (34), cross-reactivity 
against sarbecoviruses was more extensive 
in antisera from mosaic-8-immunized com- 
pared with homotypic SARS-2-immunized 
animals. In addition, although both mosaic-8- 
immunized and homotypic SARS-2-immunized 
animals were protected from SARS-2 challenge, 
only the mosaic-8-immunized animals were 
protected from SARS-1 challenge. Consistent 
with these results, polyclonal antibody epitope 
mapping by means of deep mutational scan- 
ning showed preferential binding to conserved 
RBD epitopes for mosaic-8 antisera, but bind- 
ing to more variable epitopes for homotypic 
antisera. These results highlight the potential 
for a mosaic nanoparticle approach to elicit 
more broadly protective antibody responses 
than homotypic nanoparticle approaches. 
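
The claim that two adjacent RBDs on a mosaic-8 particle are unlikely to be identical can be checked with simple arithmetic. The sketch below is an idealization rather than a description of the actual conjugation chemistry: it assumes each site is occupied independently and that the eight RBDs are present in equal proportions, neither of which is stated quantitatively in the text. The function name is illustrative.

```python
from fractions import Fraction


def prob_adjacent_same(proportions):
    """Probability that two neighboring sites carry the same RBD.

    Assumes each site is filled independently at random according to
    `proportions` (an idealization; real conjugation may deviate).
    """
    total = sum(proportions)
    return sum((Fraction(p) / total) ** 2 for p in proportions)


# Mosaic-8: eight RBDs in equal proportions (assumed).
mosaic8 = prob_adjacent_same([1] * 8)
# Homotypic: a single RBD on every site.
homotypic = prob_adjacent_same([1])

print(f"mosaic-8:  {float(mosaic8):.3f}")    # 0.125
print(f"homotypic: {float(homotypic):.3f}")  # 1.000
```

Under these assumptions a given neighbor matches only one time in eight on a mosaic-8 particle, whereas every neighbor matches on a homotypic particle, which is the contrast the avidity hypothesis relies on.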


Results 

Mosaic-8b and homotypic SARS-2 Beta RBD 
nanoparticles were homogeneous and exhibited 
expected properties 


We previously used SARS-2 WA1 RBD for 
making mosaic-8 and homotypic SARS-2 RBD 
nanoparticles (34). For the challenge studies 
described here, we switched to the SARS-2 Beta 
RBD (three RBD substitutions compared with 
WA1 RBD; https://viralzone.expasy.org/9556) 
to represent the most antigenically distinct 
variant at the time the experiments were ini- 
tiated (fig. S1). 

We used the SpyCatcher-SpyTag system (37, 38) 
to covalently attach RBDs with C-terminal 
SpyTag003 sequences to a 60-mer nanopar- 
ticle (SpyCatcher003-mi3) (39) to make either 
mosaic-8b (each nanoparticle presenting 
the SARS-2 Beta RBD plus seven other 
sarbecovirus RBDs attached to the 60 sites) 
Fig. 1. Mosaic nanoparticles may preferentially induce cross-reactive 
antibodies through avidity effects. (A) (Left) Structure of SARS-2 S trimer 
(PDB ID: 6VYB) showing one “up” RBD (dashed circle). (Right) Sequence 
conservation of the 16 sarbecovirus RBDs in (D) calculated by use of the ConSurf 
Database (89) shown on four views of an RBD surface (PDB ID 7BZ5). The 
ACE2 binding footprint (PDB 6M0J) is outlined with a yellow dotted line. 
Locations of residues that are substituted in SARS-2 VOCs and VOIs as of 
March 2022 (https://viralzone.expasy.org/9556) are indicated as black dots. 
Class 1, 2, 3, 4, and 1/4 epitopes are outlined in different colored dotted lines by 
using information from structures of representative monoclonal antibodies 
bound to RBD or S trimer (C102, PDB 7K8M; C002, PDB 7K8T; S309, PDB 7JX3; 
CR3022, PDB 7LOP; and C118, PDB 7RKV). The N-linked glycan attached to 
RBD residue 343 is indicated with teal spheres, and the potential N-linked 
glycosylation site at position 370 in RBDs derived from sarbecoviruses other 
than SARS-2 is indicated with a teal circle. (B) Schematic showing hypothesis for 
how mosaic RBD nanoparticles could induce production of cross-reactive 
antibodies because B cells with receptors that bind to multiple antigens on a 
nanoparticle have a competitive advantage over B cells with receptors that 
likely bind with a single Fab to one or very few antigens on a nanoparticle. 
(Left) Clustered membrane-bound B cell receptors (only two shown for clarity) 
bind with avidity to a strain-specific epitope (dark pink triangle) on dark pink 
antigens attached to a homotypic particle. (Middle) B cell receptors cannot bind with 
avidity to a strain-specific epitope (triangle) on a dark pink antigen attached to 
a mosaic particle; thus, only one Fab from a single strain-specific B cell receptor is 
likely to bind to a mosaic particle. (Right) Clustered B cell receptors (only two shown 
for clarity) can bind with avidity to common epitope (green circle) presented on 
different antigens attached to a mosaic particle, but not to strain-specific epitopes 


(fig. S1) or homotypic (each nanoparticle pre- 
senting 60 copies of the SARS-2 Beta RBD) 
RBD-mi3 nanoparticles (fig. S2A). In addi- 
tion to the SARS-2 Beta RBD, the other RBDs 
in mosaic-8b nanoparticles were chosen from 
clade 1a, 1b, and 2 sarbecoviruses [clades as 
defined in (13)] (Fig. 1, C and D, and figs. S1 
and S2A). Immune responses against these 
strains in immunized animals were consid- 
ered “matched” because each was represented 
by an RBD on mosaic-8b. Sarbecoviruses from 
clades 1, 2, and 3 and SARS-2 RBDs other than 
SARS-2 Beta that did not have RBDs repre- 
sented on mosaic-8b nanoparticles were con- 
sidered “mismatched” in immunological assays 
and challenge experiments. SARS-1 was chosen 
as a mismatched strain to allow challenge ex- 
periments and because it uses human ACE2 
as its host receptor (40) and can therefore be 
evaluated in ACE2-dependent pseudotyped 
neutralization assays, although SARS-1 is closely 
related to WIV1, a clade 1a bat sarbecovirus 
represented on the nanoparticles. Two ver- 
sions of mosaic-8 were used for experiments: 
mosaic-8b RBD-mi3 (SARS-2 Beta RBD and 
seven animal sarbecovirus RBDs) (Fig. 1, C 
and D) and mosaic-8g (mosaic-8 with a WA1 
SARS-2 RBD plus the seven zoonotic RBDs) 
(34), in which N-linked glycosylation site 
sequons at RBD position 484 were introduced 
in the clade 1a and 1b RBDs to occlude class 1 
and 2 RBD epitopes (fig. S2A). RBD residue 
484 was chosen for N-glycan addition because 
substitutions at this position in SARS-2 RBD 
prevented binding of some human class 1 and 
class 2 antibodies to RBD (26). Mosaic-8 and 
homotypic SARS-2 Beta nanoparticles were 
purified by means of size-exclusion chroma- 
tography (SEC) (fig. S2B) and validated with 
SDS-polyacrylamide gel electrophoresis (SDS- 
PAGE) to show near 100% conjugation effi- 
ciency (fig. S2C). Dynamic light scattering (DLS) 
and negative-stain electron microscopy (EM) 
demonstrated that conjugated nanoparticles 
were monodisperse and exhibited a defined 
diameter (fig. S2, D and E), and interactions of 
human ACE2 and monoclonal antibodies with 
known epitopes exhibited expected binding 
profiles (fig. S3). 


RBD nanoparticles elicited binding and 
neutralizing antibody responses in 
K18-human ACE2 transgenic mice 


To compare the efficacies of mosaic and homo- 
typic RBD-mi3 nanoparticle immunizations, 
we evaluated immune responses and protection 
from viral challenge in K18-human ACE2 
(K18-hACE2) transgenic mice (Figs. 2 and 3) 
(41). K18-hACE2 mice express human ACE2 
driven by a cytokeratin promoter in epithelia 
(including airway epithelial cells, where SARS-2 
infections often start) and recapitulate severe 
COVID-19 upon infection with SARS-2 (41-43). 
Viral challenges of K18-hACE2 mice result in 
extensive weight loss, and death usually re- 
sults from SARS-2 or SARS-1 infection (42). 
We chose this lethal challenge model to eval- 
uate the highest levels of potential protection, 
which might then be used to extrapolate to the 
expected efficacy of a vaccine in humans. 

K18-hACE2 mice were primed with either 
mosaic-8b, mosaic-8g, homotypic SARS-2 Beta, 
or unconjugated SpyCatcher-mi3 nanoparticles 
adjuvanted with AddaVax and boosted 4 weeks 
later (Fig. 2A). In these experiments, SARS-2 
Beta represented a matched sarbecovirus for 
the mosaic-8b RBD-mi3 and homotypic SARS-2 
RBD-mi3 immunogens but was mismatched 
for mosaic-8g. SARS-1 was mismatched for all 
three nanoparticle immunogens. 

We first evaluated serum antibody responses 
in binding and pseudovirus neutralization 
assays 14 days after boosting (Fig. 2, B to I). 
Serum enzyme-linked immunosorbent assays 
(ELISAs) were conducted to assess binding to 
the indicated RBDs and to the 6P stabilized 
version of the soluble WA1 S trimer (44). All 
three immunogens elicited high ELISA binding 
antibody titers, with titers against the WA1 
RBD and S-6P modestly higher for homotypic 
SARS-2 Beta-immunized mice with respect to 
mosaic-8b- and mosaic-8g-immunized mice 
(Fig. 2B). However, although the mosaic-8b and 
mosaic-8g RBD-mi3 ELISA titers were not 
significantly different from homotypic RBD-mi3 
antisera titers against the other SARS-2 
variants (Fig. 2, C and D), ELISA titers were 
increased against sarbecovirus RBDs derived from 
viruses other than SARS-2 (Fig. 2, E and G 
to I); for example, binding titers were 
significantly higher for mosaic-8 antisera 
than for the homotypic antisera against SARS-1 
(mismatched) (Fig. 2E), SHC014 (matched 
for the two mosaic-8 immunogens) (Fig. 2G), 
BM48-31 (mismatched) (Fig. 2H), and Yun11 
(mismatched) (Fig. 2I) (with the exception of 
mosaic-8g and homotypic antisera in Fig. 2, 
F and H). As previously reported for RBD-mi3 
immunogens (34), the trends for serum 
RBD binding were generally predictive of 
(triangles). (C) Sarbecoviruses from which the RBDs in mosaic-8b RBD-mi3 were 
derived (matched) and sarbecoviruses from which RBDs were not included in 
mosaic-8b (mismatched). Clades are defined as in (13). The WA1 SARS-2 RBD was 
used in mosaic-8g instead of the SARS-2 Beta RBD. (D) Phylogenetic tree of 
selected sarbecoviruses calculated by using PhyML 3.0 (90), based on amino acid 
sequences of RBDs aligned by using Clustal Omega (91). Viruses with RBDs included 
in mosaic-8b are indicated with gray rectangles. 


pseudovirus neutralization; however, the homo- 
typic SARS-2 Beta antisera showed signifi- 
cantly higher neutralization titers against 
all three SARS-2 variants than those of the 
mosaic-8b and mosaic-8g antisera (Fig. 2, 
B to D). This contrasts with our previous re- 
ports of equivalent neutralization for antisera 
from mosaic-8 RBD-mi3- and homotypic 
SARS-2 RBD-immunized mice (34). In the 
earlier experiments, the SARS-2 RBD in both 
nanoparticles was derived from the WA1 strain 
rather than the Beta VOC, which could elicit 
increased levels of potent class 2 antibodies 
to RBD that bind RBDs with residue Glu484 
(E484), which is substituted in most SARS-2 
variants (45). Equivalent levels of antibodies 
to SARS-2 RBD binding, but lower neutral- 
ization potencies, in mosaic-8 versus homo- 
typic RBD antisera are consistent with a larger 
portion of non-neutralizing antibodies against 
SARS-2 induced through mosaic-8 immuni- 
zation. Non-neutralizing antibodies could be 
involved in protection because non-neutralizing 
antibodies have been shown to play a role in 
preventing severe COVID-19 from natural in- 
fection or challenges after vaccination (46, 47). 
Despite their reduced neutralization potencies 
against SARS-2, the mosaic-8b and mosaic-8g 
antisera showed significantly higher neutralization 
titers than the homotypic antisera against 
clade 1a viruses such as SARS-1 (mismatched) 
(Fig. 2E) and WIV1 and SHC014 (matched for 
mosaics) (Fig. 2, F and G), which is in agree- 
ment with earlier experiments (34). 


Mosaic-8b, but not homotypic SARS-2 Beta, 
RBD-mi3 immunizations protect against 
matched and mismatched viral challenges 


The four groups of immunized K18-hACE2 
mice (n = 10) were challenged with SARS-2 
Beta or SARS-1 (Fig. 2A). Four mice per group 
were euthanized at 4 days after challenge for 
viral load analysis, and the remaining six mice 
were monitored for survival up to 28 days after 
challenge. Mice in each cohort were evaluated 
for weight loss, survival, and levels of viral 
genomic and subgenomic RNA, and infectious 
virus in lung tissue and oropharyngeal swabs 
(Fig. 3). Control animals immunized with un- 
conjugated mi3 showed rapid weight loss and 
death 4 to 6 days after SARS-2 or SARS-1 chal- 
lenge. As evaluated by relative weight loss, the 
mosaic-8b and homotypic SARS-2 Beta RBD- 
mi3 nanoparticles were equally protective 
against SARS-2 challenge, showing minimal to 
Fig. 2. Mosaic-8b and homotypic SARS-2 Beta RBD-mi3 immunizations induced 
binding and neutralizing antibodies in K18 mice. (A) (Left) Immunization 
schedule. K18-hACE2 mice were immunized with either 5 μg (RBD equivalents) 
mosaic-8b, mosaic-8g, homotypic SARS-2 Beta, or the molar equivalent of 
unconjugated SpyCatcher003-mi3 nanoparticles. (Right) Structural models of 
mosaic-8 and homotypic RBD-mi3 nanoparticles constructed by 
no weight loss, whereas some of the mosaic-8g 
animals experienced transient weight loss but 
recovered by 10 days after challenge (Fig. 3A). 
By contrast, only the mosaic-8b and mosaic-8g 
immunizations prevented weight loss after 
the SARS-1 challenge, whereas the homotypic 
SARS-CoV-2 Beta RBD-mi3 mice experienced 
similar weight loss as that of the mi3 control 
mice (Fig. 3A). The post-challenge survival 
results were consistent with weight loss: After 
SARS-2 Beta challenge, both the mosaic-8b 
and homotypic immunized animals showed 
complete survival (100%), whereas five of 
six (~83%) animals in the mosaic-8g 
immunization group survived (Fig. 3B). After 
SARS-1 challenge, all but one of the mice in the 
homotypic SARS-2 Beta group reached endpoint 
criteria within 6 days (a delay compared 
with the control mi3 group, in which all animals 
reached endpoint criteria within 4 days), 
whereas all mice in the mosaic-8b and mosaic-8g 
groups survived the challenge during the 
28 days of post-challenge monitoring (Fig. 3B). 

Altogether, despite elicitation of lower neu- 
tralization titers against SARS-2 compared with 
homotypic Beta RBD-mi3 antisera (Fig. 2, B to 
D), immunization with mosaic-8b RBD-mi3 
Fig. 3. Mosaic-8b immunization protected against SARS-2 and SARS-1 
challenges in K18-hACE2 mice, whereas homotypic SARS-2 immunization 
protected only against SARS-2. Mice were immunized and boosted with the 
indicated mi3 nanoparticles (5 μg RBD equivalents for RBD nanoparticles; molar 
equivalent for unconjugated mi3 control). (A) Weight changes after SARS-2 Beta or 
SARS-1 challenge. Mean weight in each vaccinated cohort is indicated with a thick 
colored line. Weights of individual mice are indicated with colored dashed lines. 


was fully protective against both matched 
(SARS-2 Beta) and mismatched (SARS-1) chal- 
lenges in the K18-hACE2 mouse model. By con- 
trast, immunization with homotypic SARS-2 
RBD-mi3 was protective against the matched 
SARS-2 Beta challenge but not against the 
mismatched SARS-1 challenge. Mosaic-8g 
immunization protected against SARS-1 chal- 
lenge but showed somewhat reduced efficacy 
against the SARS-2 Beta challenge, perhaps 
related to occluding the class 2 RBD epitope 
targeted by potent, but usually strain-specific, 
neutralizing antibodies. 


Infectious virus levels and viral RNA copies 
correlate with protection in challenged 
K18-hACE2 mice 


We measured levels of infectious virus and 
viral RNA in lung and oropharyngeal swab 
samples from challenged K18-hACE2 mice 
(n = 4), obtained at 4 days after challenge 
(Fig. 3, C and D). Infectious virus titers were 
(B) Survival after SARS-2 Beta or SARS-1 challenge. (C) (Left) SARS-2 Beta 
infectious titers after challenge in lung tissue and oropharyngeal swabs. (Right) 
Genomic and subgenomic SARS-2 Beta RNA copies determined with RT-PCR. 

(D) (Left) SARS-1 infectious titers after challenge in lung tissue and oropharyngeal 
swabs. (Right) Genomic and subgenomic SARS-1 RNA copies determined with 
RT-PCR. Significant differences between cohorts linked by horizontal lines are 
indicated with asterisks: *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001. 


measured by determining the median tissue 
culture infectious dose (TCID50) of either 
oropharyngeal swab or lung tissue homogenate 
samples as described (48). In SARS-2 
Beta-challenged mice, vaccination with either 
mosaic-8b or homotypic SARS-2 Beta completely 
suppressed viral replication in the lungs 
and oropharyngeal swabs, whereas levels of 
infectious virus in mosaic-8g-immunized animals 
were equivalent to those in control-immunized 
mice in lungs. The SARS-2 Beta infectious viral 
load was lower for mosaic-8g-immunized 
mice with respect to control animals in the 
oropharyngeal swabs, suggesting partial pro- 
tection in these animals (Fig. 3C, left). Almost 
all vaccinated animal groups displayed com- 
pletely suppressed infectious SARS-1 in lungs 
compared with that of control-immunized mice 
(Fig. 3D, left). However, only mosaic-8b- and 
mosaic-8g-immunized animals showed complete 
suppression of infectious SARS-1 in 
oropharyngeal swabs, whereas homotypic SARS-2 
Beta-immunized animals showed infectious 
viral loads that were similar to those of con- 
trol animals (Fig. 3D, left), possibly explain- 
ing the severity of SARS-1 in these animals 
(Fig. 3, A and B). 
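
For readers unfamiliar with endpoint titration, the following sketch shows one common way a TCID50 can be estimated from a dilution series, the Spearman-Kärber method. This is a stand-in chosen for illustration: the text only says titers were determined "as described (48)", so the method, the function name, and the example well counts below are all assumptions and not the authors' protocol or data.

```python
def log10_tcid50_spearman_karber(first_log10_dilution, log10_step, fraction_positive):
    """Spearman-Karber estimate of the log10 50% endpoint titer.

    first_log10_dilution: log10 of the reciprocal of the first (least dilute)
        dilution, e.g. 1.0 for a 1:10 dilution.
    log10_step: log10 of the dilution factor, e.g. 1.0 for 10-fold steps.
    fraction_positive: fraction of wells showing infection at each dilution,
        ordered from least to most dilute.
    """
    # The estimator is only valid if the series brackets the endpoint.
    if fraction_positive[0] != 1.0 or fraction_positive[-1] != 0.0:
        raise ValueError("series must span 100% to 0% positive wells")
    return first_log10_dilution - log10_step / 2 + log10_step * sum(fraction_positive)


# Hypothetical series: 10-fold dilutions from 1:10 to 1:100,000.
example = log10_tcid50_spearman_karber(1.0, 1.0, [1.0, 1.0, 0.5, 0.0, 0.0])
print(f"log10 TCID50 per inoculum volume: {example:.2f}")  # 3.00
```

In this made-up series half of the wells are infected at the 1:1000 dilution, so the estimated endpoint is 10^3 TCID50 per inoculated volume; converting to a per-milliliter or per-gram titer would additionally require the inoculum volume or tissue mass.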

Viral RNA copies in the lung tissue and oro- 
pharyngeal swabs were measured with reverse 
transcription polymerase chain reaction (RT- 
PCR) by using both genomic and subgenomic 
primer sets. Genomic RT-PCR titers reflect the 
overall RNA copies in the sample, including 
both replicating virus in infected cells and 
viral particles and debris, whereas subge- 
nomic RNA is a better surrogate for infectious 
viral titers because it is produced in infected 
cells and poorly packaged into virions (49). 
In lung tissue samples, genomic and sub- 
genomic SARS-2 Beta viral RNA titers were 
lower in mosaic-8b-immunized and homo- 
typic SARS-2 Beta-immunized animals com- 
pared with control-immunized animals (Fig. 
3C, right), which is consistent with infectious 
virus titer measurements (Fig. 3C, left) and 
protection against SARS-2 Beta challenge in 
mosaic-8b-immunized and homotypic SARS-2 
Beta-immunized animals (Fig. 3, A and B). 
SARS-2 Beta RNA copies in oropharyngeal 
swabs were low for all immunized cohorts 
compared with the control (Fig. 3C, right). 
Subgenomic SARS-1 viral RNA titers were completely 
suppressed in mosaic-8b- and mosaic-8g-immunized 
animals in both lung tissue and 
oropharyngeal swabs with respect to the con- 
trol (Fig. 3D, right), which also is consistent 
with infectious viral titers (Fig. 3C, left) and 
complete protection in these cohorts against 
SARS-1 challenge (Fig. 3, A and B). The lack of 
suppression of viral RNA and infectious viral 
loads in oropharyngeal swabs from homotypic 
SARS-2 Beta-immunized animals challenged 
with SARS-1 correlates with lethality from 
SARS-1 infection, possibly because of virus 
entry into the brain through nasal infection 
(50). Homotypic SARS-2-immunized animals 
showed significantly lower levels of subge- 
nomic viral RNA in the lung (Fig. 3D) with 
respect to control animals, suggesting partial 
control of SARS-1 in these tissues. 

The K18-hACE2 mouse experiments were 
designed to evaluate survival, weight loss, and 
reduction or absence of viral replication as the 
primary metrics for vaccine efficacy (50). How- 
ever, we also obtained lung tissue 4 days after 
challenge for analysis by hematoxylin and eosin 
(H&E) staining and immunohistochemistry 
(IHC) (data S1 and fig. S4). For SARS-2 Beta 
infection, immunization with mosaic-8b and 
homotypic SARS-2 Beta nanoparticles showed 
minimal to no pathology as well as IHC staining 
for SARS-2 N protein with respect to control 
animals and mosaic-8g-vaccinated animals 
(fig. S4, A to H, and Q), which is consistent 
with viral infectious titers and viral RNA titers 
in the lung (Fig. 3C). SARS-1 infection lung 
pathology for all animals was minimal for 
both vaccinated and control animals, and 
only control animals showed IHC staining for 
SARS-1 N (fig. S4, I to P and Q), which again is 
consistent with viral infectious titers and viral 
RNA titers in the lung (Fig. 3D). 


Mosaic-8b RBD nanoparticles protect nonhuman 
primates from mismatched viral challenges 


To extend these results to another animal 
model of SARS-2 and SARS-1 infection, we also 
conducted immunization and challenge studies 
in nonhuman primates (NHPs). Because of 
limitations of available NHPs, we evaluated 
only the most promising vaccine candidate, 
mosaic-8b RBD-mi3, comparing it with a non- 
immunized cohort for challenges with either 
SARS-2 Delta or with SARS-1. Both challenge 
viruses were mismatched because the mosaic- 
8b nanoparticles did not include RBDs from 
SARS-2 Delta or SARS-1. 

Eight NHPs were immunized and boosted 
at day 28 with mosaic-8b RBD-mi3 adjuvanted 
with VAC20 [2% aluminum hydroxide wet 
gel (Al2O3)] (alum) and then boosted again at 
day 92 with mosaic-8b RBD-mi3 adjuvanted 
with EmulsiPan, an MF59-like squalene-based 
oil-in-water emulsion adjuvant (51). Four weeks 
after the second boost, half of the NHPs in the 
vaccinated and control groups were challenged 
with SARS-2 Delta, and the other half were 
challenged with SARS-1 (Fig. 4A). 

Polyclonal antisera were evaluated with 
ELISA for binding to SARS-2 VOCs and eval- 
uated for neutralization activity by use of 
pseudovirus and authentic virus neutralization 
assays (Fig. 4B). The RBD ELISA and neutrali- 
zation results showed similar trends, with rela- 
tively weak binding and neutralization before 
the first boost, rising levels after the first boost 
that contracted by day 92, and then rising again 
and remaining above the 1:100 neutralization 
titers that correlate with ~90% vaccine effi- 
ciency (52) after the second boost. Binding 
antibody levels were similar for SARS-2 Beta 
(matched) and the mismatched WA1, Delta, 
and Omicron BA.1, BA.2, BA.2.12.1, and BA.4/ 
BA.5 SARS-2 variant RBDs, which is predictive 
of neutralization across SARS-2 variants (34) as 
verified in available pseudovirus and authentic 
virus neutralization assays for SARS-2 variants 
(Fig. 4B). We also evaluated binding and neu- 
tralization against RBDs and pseudoviruses 
from other sarbecovirus lineages (clades 1a, 2, 
and 3), including matched (WIV1 and SHC014) 
and mismatched (SARS-1, LYRa3, RshSTT200, 
BM48-31, BtKY72, Khosta-2, and Yun11) viruses 
(Fig. 4, C and D). Similar trends were observed 
for binding and neutralization of non-SARS-2 
sarbecoviruses as seen for the SARS-2 variants: 
All RBDs were recognized by polyclonal anti- 
sera in ELISAs and neutralized in pseudovirus 
assays for human ACE2 entry-dependent viral 
strains for which neutralization assays could 
be conducted, including mismatched strains: 
SARS-1; a mutant form of BtKY72 (K493Y/T498W) 
that uses ACE2 for entry (13); and 
Khosta-2 and LYRa3 viruses, which contain 
chimeric spikes (RBDs from Khosta-2 or LYRa3, 
remaining S from SARS-1). In all cases, the 
contracting antibody responses at day 92 were 
restored by the second boost (Fig. 4, C and D). 
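
To make the neutralization titers concrete, the sketch below estimates a half-maximal inhibitory dilution (ID50) from a serum dilution series by log-linear interpolation between the two dilutions that bracket 50% neutralization. This is a simplified illustration: the text does not describe how the curves were fit, the numbers are invented, and a full analysis would more likely fit a sigmoidal dose-response curve.

```python
import math


def id50_by_interpolation(reciprocal_dilutions, percent_neutralization):
    """Estimate ID50 as a reciprocal serum dilution.

    reciprocal_dilutions: ascending values, e.g. 50 for a 1:50 dilution.
    percent_neutralization: measured neutralization (0-100) at each dilution.
    Returns None if the curve never crosses 50% within the tested range.
    """
    points = list(zip(reciprocal_dilutions, percent_neutralization))
    for (d_lo, n_lo), (d_hi, n_hi) in zip(points, points[1:]):
        if n_lo >= 50.0 > n_hi:
            # Interpolate linearly in log10(dilution) between the two points.
            frac = (n_lo - 50.0) / (n_lo - n_hi)
            log_id50 = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_id50
    return None


# Hypothetical three-fold dilution series (not data from this study).
id50 = id50_by_interpolation([50, 150, 450, 1350], [95.0, 80.0, 40.0, 10.0])
print(f"ID50 is roughly 1:{id50:.0f}")
print("above the 1:100 threshold:", id50 > 100)
```

A serum with this hypothetical profile would sit above the 1:100 neutralization titer that the text uses as a benchmark.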

Mosaic-8b RBD-mi3-immunized and con- 
trol NHPs were challenged with either SARS-2 
Delta or SARS-1 (both mismatched) 28 days 
after the second boost (Fig. 4A). Protection 
was assessed by measuring infectious virus 
titers (Fig. 5, A and B) and by viral RNA using 
RT-PCR (SARS-2 only) (Fig. 5C) in bronchial 
alveolar lavage (BAL) and nasal swabs 2 or 
4 days after challenge. We observed no detect- 
able SARS-2 Delta infectious virus in BAL at 
either time point, whereas BAL from three of 
four control animals showed infectious SARS-2 
virus (Fig. 5A). Nasal swabs showed low levels 
of virus in vaccinated animals, but at levels 
significantly reduced, by about two logs, compared 
with those of control NHPs (Fig. 5A), which is 
consistent with reports of detectable virus rep- 
lication in upper airways in animals that were 
protected from clinical disease (53). For SARS- 
1-challenged animals, mosaic-8b-immunized 
NHPs had no detectable viral titers in either 
BAL or nasal swabs, whereas all control animals 
had detectable viral titers (three of four in BAL 
and one of four in nasal swabs; individual ani- 
mals identified by different colors) (Fig. 5B). 
Although the low sample numbers precluded 
statistical significance of BAL or nasal swab 
differences in immunized versus control chal- 
lenged animals, the lack of detectable SARS-1 
virus in mosaic-8b-immunized animals was 
suggestive of protection. 
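
The point about small group sizes can be illustrated with a two-sided Fisher exact test on the detection counts given above (0 of 4 immunized versus 3 of 4 controls in BAL, and 0 of 4 versus 1 of 4 in nasal swabs). The choice of test is an assumption made here for illustration; the text does not state which statistical test was applied to these comparisons.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def table_prob(k):
        # Hypergeometric probability of k first-column events in row 1.
        return comb(row1, k) * comb(row2, col1 - k) / comb(total, col1)

    p_obs = table_prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as observed.
    return sum(
        p for p in (table_prob(k) for k in range(k_min, k_max + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Rows: immunized, control. Columns: virus detected, not detected.
p_bal = fisher_exact_two_sided(0, 4, 3, 1)    # BAL: 0 of 4 vs 3 of 4
p_nasal = fisher_exact_two_sided(0, 4, 1, 3)  # nasal swabs: 0 of 4 vs 1 of 4
print(f"BAL p = {p_bal:.3f}, nasal p = {p_nasal:.3f}")
```

With four animals per group, neither comparison reaches P < 0.05 under this test, even though no immunized animal had detectable virus.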

Viral RNA copies were measured with RT- 
PCR in nasal swabs, BAL, and lung tissue taken 
from mosaic-8b-immunized and control animals 
challenged with SARS-2 Delta (Fig. 5C). 
Both genomic and subgenomic RNA titers were 
significantly reduced in mosaic-8b-immunized 
animals in both BAL and nasal swabs (Fig. 5C), 
which is consistent with infectious virus titers 
(Fig. 5A). Genomic and subgenomic RNA copies 
in lung tissue were also reduced for 
mosaic-8b-immunized animals (Fig. 5C), which is 
consistent with the observed protection. Both 
vaccinated and unvaccinated animals chal- 
lenged with SARS-2 Delta showed a lack of 
Fig. 4. Mosaic-8b RBD-mi3 immunization induced binding and neutralizing 
antibodies in NHPs. Mismatched RBDs or viruses with respect to mosaic-8b 
RBD-mi3 nanoparticles are indicated with cyan rectangular boxes. (A) (Left) 
Immunization schedule. NHPs were primed and boosted with 25 μg (RBD 
equivalents) mosaic-8b RBD-mi3 in alum and boosted again with 25 μg mosaic-8b 
RBD-mi3 in EmulsiPan. Eight immunized NHPs and eight unimmunized 
NHPs were then challenged with either SARS-2 Delta (four immunized and 
four unimmunized) or with SARS-1 (four immunized and four unimmunized). 
(Right) Structural model of mosaic-8b RBD-mi3 nanoparticles as shown 
in Fig. 2A. (B to D) RBDs or viruses for assays indicated as different colors. 
ELISA and neutralization data for antisera from individual NHPs (open circles) 
presented as the mean (bars) and standard deviation (horizontal lines). 
ELISA results are shown as midpoint titers (EC50 values); neutralization 
results are shown as half-maximal inhibitory dilutions (ID50 values). 
Dashed horizontal lines correspond to the background values representing 
the limit of detection. 
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Fig. 5. Mosaic-8b immu- 
nization protected NHPs 
against SARS-2 Delta and SARS-1 challenges.
NHPs were immunized with mosaic-8b RBD-mi3
or not immunized (control) before challenge.
(A) SARS-2 Delta infectious titers after challenge
in BAL (left) and nasal swabs (right). Individual
animals are denoted with different colors.
(B) SARS-1 infectious titers after challenge
in BAL (left) and nasal swabs (right).
Individual NHPs in the unvaccinated control
group are denoted with different colors to show
that all four animals exhibited signs of detectable
SARS-1 infectious virus in BAL and/or nasal
swabs. (C) (Top left) Genomic and subgenomic
SARS-2 Delta RNA copies determined with
RT-PCR in BAL day 2 and day 4 after infection.
(Top right) Nasal swabs day 2 and day 4 after
infection. (Bottom) Lung tissue day 4 after
infection. Significant differences between
cohorts linked by horizon-

significant pathology in the lungs (fig. S5 and
data S2) and no detectable differences in cyto- 
kine and chemokine levels in BAL (fig. S6). 


Antisera to mosaic-8b target conserved RBD 
epitopes, whereas antisera to homotypic 
RBD-mi3 target variable epitopes 


We next assessed whether mosaic and homo- 
typic nanoparticles elicited different types of 
anti-RBD antibodies, as suggested by protection
against matched challenge for both mosaic-8b
and homotypic SARS-2 RBD-mi3 cohorts but 
protection against mismatched challenge only 
for the mosaic-8b RBD-mi3 cohort. Immuni- 
zations with either mosaic-8b or homotypic 
SARS-2 Beta nanoparticles adjuvanted with 
AddaVax were conducted in BALB/c mice 
(prime and boost 3 weeks later) (fig. S7A). 
Antibodies to RBD in serum 4 weeks after boost 
evaluated with ELISA and neutralization (fig.
S7, B to G) exhibited similar characteristics
as seen in the immunized K18-hACE2 mice 
(Fig. 2). To compare the characteristics of 
antibodies elicited by each type of RBD nano- 
particle, we mapped SARS-2 Beta epitopes 
recognized by immunization-elicited antibodies
to investigate whether mosaic-8b, but
not homotypic SARS-2 Beta, preferentially
elicited antibodies against conserved
RBD epitopes as hypothesized (Fig. 1B). For
these experiments, we used yeast-display deep
mutational scanning to map mutations in the 
SARS-2 Beta RBD (54, 55) that escaped binding 
by antisera raised in BALB/c mice immunized 
with either mosaic-8b or homotypic SARS-2 Beta 
RBD-mi3 nanoparticles (Fig. 6 and figs. S8 to S10).
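
Fig. 6 summarizes these maps as site-total antibody escape, defined there as the sum of the antibody escape of all mutations at a site. The aggregation can be sketched as follows. The record layout, the function name, and the numbers are made up for illustration and do not reflect the format of the authors' data files.

```python
from collections import defaultdict


def site_total_escape(mutation_escape):
    """Sum per-mutation escape values into one total per RBD site.

    mutation_escape: iterable of (site, mutant_amino_acid, escape) tuples.
    Returns a dict mapping site -> summed escape at that site.
    """
    totals = defaultdict(float)
    for site, _mutant, escape in mutation_escape:
        totals[site] += escape
    return dict(totals)


# Hypothetical values, not taken from the paper's data.
example = [
    (484, "E", 0.5),
    (484, "Q", 0.25),
    (383, "P", 0.125),
]
totals = site_total_escape(example)
```

A site with many moderately escaping mutations and a site with one strongly escaping mutation can have the same total, which is why the per-mutation logo plots are a useful complement.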

Mapping showed that mosaic-8b antisera 
from six immunized mice primarily targeted 
more conserved RBD epitopes, including 


Fig. 6. Antibodies elicited by mosaic-8b immuniza- 
tion map to conserved RBD epitopes, as compared 
with antibodies elicited by homotypic SARS-2 

Beta immunization. (A) Deep mutational scanning 
was used to identify mutations that reduced binding 

of sera from BALB/c mice immunized with mosaic-8b 
RBD-mi3 (top) or homotypic SARS-2 Beta RBD-mi3 
(bottom) to the SARS-2 Beta RBD. The y axis shows 
the site-total antibody escape (sum of the antibody 
escape of all mutations at a site), with larger numbers 
indicating more antibody escape. Each light gray line 
indicates one antiserum, and the heavy black lines 
indicate the average across the n = 6 sera per group. 
RBD sites 340 to 408 and 462 to 468, which include the 
more conserved class 3/4 epitopes, are indicated 

with solid gray lines, and sites 472 to 503, which include 
sites from the more variable class 1/2 epitopes, are 
indicated with dashed lines. The “conserved” and 
“variable” epitopes presented here were generalized for 
simple visualization and are not identical to more 
specific epitope-class definitions (26, 60). The highly 
variable RBD class 2 site 484 that is immunodominant 
among humans infected with SARS-2 (45, 60) and 

the subdominant class 4 sites 383 to 386 are labeled. 
(B) The average site-total antibody escape for mice 
immunized with mosaic-8b RBD-mi3 (top) or homotypic 
SARS-2 RBD-mi3 (bottom) mapped to the surface of 
the SARS-2 Beta RBD (PDB 7LYQ), with white indicating 
no escape and red indicating sites with the most 
escape. Key sites are labeled, all of which are class 3/4 
sites, except for the class 2 484 site. Interactive
logo plots and structure-based visualizations of the
antibody-escape maps are at
https://jbloomlab.github.io/SARS-CoV-2-RBD_Beta_mosaic_np_vaccine.
Individual antibody-escape maps are in fig. S9; raw
data are in data S3 and at
https://github.com/jbloomlab/SARS-CoV-2-RBD_Beta_mosaic_np_vaccine/blob/main/results/supp_data/all_raw_data.csv.

(C) (Top) Residues in a “down” RBD that contact 
other regions of spike shown in blue on an RBD surface 
(PDB 7BZ5). Interacting residues were identified by
using the PDBePISA software server
(https://www.ebi.ac.uk/pdbe/prot_int/pistart.html) and the RBD from
chain A of the spike trimer structure in PDB 7M6E. 
(Middle) Variable to conserved sarbecovirus sequence 
gradient (dark pink, variable; green, conserved) shown 
on RBD surface as in Fig. 1A. (Bottom) Structure 

of SARS-2 S trimer (PDB 6VYB) showing “down” 

RBD (boxed) colored with the variable to conserved 
sarbecovirus sequence gradient.

class 4 (RBD residues 383 to 386) and class 3
(residue 357) epitopes (Fig. 6A and fig. S9),
whereas homotypic SARS-2 antisera primarily
targeted variable RBD epitopes, particularly
class 2 (residue K484) (Fig. 6B and fig. S9).
These results confirmed that mosaic-8b RBD- 
mi3 elicited antibodies against the conserved 
class 3 and class 4 epitopes, as designed in the 
mosaic RBD nanoparticle approach (Fig. 1, A
and B). By contrast, homotypic SARS-2 RBD-mi3
primarily elicited antibodies against the
more variable class 2 epitope (characterized by 
RBD residue 484) that varies between sarbe- 
coviruses and in SARS-2 VOCs (Fig. 1A). 

We also mapped mutations that reduce bind- 
ing of four serum samples from NHPs vacci- 
nated with three doses of mosaic-8b RBD-mi3 
(day 106) (Fig. 4). The NHP antibody-escape
profiles were relatively broad, suggesting that
no single mutation had a disproportionately 
large effect on binding (fig. S10). The NHP sera 
showed some skewing toward class 4 RBD epi- 
topes and slight targeting of K484 (class 2) and 
T500 (class 3) (fig. S10). Differences in antibody- 
escape profiles for mosaic-8b-immunized mice 
and NHPs could be related to species differ- 
ences (56) and/or different immunization reg- 
imens (Figs. 2A and 4A). The broad escape 
profiles from NHP antisera may suggest either 
antibody binding to a broad set of RBD epi- 
topes and/or a population of affinity-matured 
antibodies that are less affected by single- 
point mutations. Nevertheless, the antibody- 
escape mapping results for mice and NHPs 
immunized with mosaic-8b RBD-mi3 are con- 
sistent with the hypothesis that the mosaic-8b 
nanoparticles elicit antibodies that target con- 
served RBD epitopes (Fig. 1B). 

Mapping of sera from mosaic-8b-immunized 
mice and NHPs demonstrated relatively low, 
but nonzero, targeting of variable epitopes— 
for example, typified by RBD residue 484. 
Immunization with mosaic-8g RBD-mi3 nanoparticles,
in which class 1 and 2 epitopes in
clade 1a and 1b RBDs were likely at least partially
occluded by N-glycosylation at residue 484,
was less protective against SARS-2 challenge 
than immunization with mosaic-8b, in which 
these epitopes were intact (Fig. 3). This re- 
sult implies that retaining at least a subset of 
antibodies that target the immunodominant 
class 1 and 2 epitopes may be important for 
protection against SARS-2 challenge. Strat- 
egies to occlude these epitopes by introducing 
N-glycans (57, 58) may thus impede optimal 
protection against SARS-2. 


Discussion 


Antibodies to RBD raised by infection and vac- 
cination can potently neutralize SARS-2 through 
blocking S trimer binding to the ACE2 receptor 
required for viral entry (16-20, 23-29). Although 
neutralizing antibodies recognize multiple RBD 
epitopes (26, 59), immunoglobulins G (IgGs) in 
human polyclonal plasmas tend to target the 
class 1 and class 2 epitopes that are under- 
going rapid evolution in SARS-2 and that vary 
between zoonotic and human sarbecoviruses 
(28, 45, 60). Some monoclonal antibodies 
against these epitopes maintain breadth (61)
but more commonly show partial or complete 
loss of potency against SARS-2 VOCs and only 
rarely cross-react with animal sarbecoviruses 
(45, 62, 63). By contrast, although less common 
and usually less potent than antibodies against 
class 1 and class 2 anti-RBD epitopes, anti- 
bodies against the more conserved class 3, 
4, and 1/4 epitopes exhibit increased cross- 
reactivity across sarbecoviruses and SARS-2 
VOCs (27, 31-33, 64, 65). Therefore, a vaccine 
that elicits such antibodies could serve to protect
against SARS-2, its variants, and emerging
zoonotic sarbecoviruses without the need for
updating in the event of new VOCs and/or 
another sarbecovirus epidemic or pandemic. 

Homotypic SARS-2 RBD or S trimer nano- 
particles elicit potent neutralizing antibody 
responses that exhibit some degree of cross- 
reactivity across SARS-2 variants and sarbe- 
coviruses (34, 36, 66-75). We reproduced results 
for homotypic nanoparticles and extended 
them in challenge studies comparing protec- 
tion against SARS-2 and SARS-1 conferred by 
a mosaic-8 RBD nanoparticle versus a homo- 
typic SARS-2 RBD nanoparticle. We demon- 
strated protection from SARS-2 challenge in 
animals immunized with homotypic SARS-2 
RBD-mi3 and with mosaic-8b RBD-mi3 (chal- 
lenges for the mosaic-8b included both a 
matched SARS-2 variant and a mismatched 
variant), despite mosaic-8b containing one- 
eighth as many SARS-2 RBDs as that of its 
homotypic SARS-2 counterpart. These results 
suggest that a mosaic RBD nanoparticle could 
be used as a COVID-19 vaccine option to pro- 
tect from current and future SARS-2 variants. 
In addition, mosaic-8b, but not homotypic 
SARS-2 RBD-mi3 nanoparticles, protected 
K18-hACE2 mice against lethality with a mis- 
matched SARS-1 challenge, suggesting that a 
mosaic nanoparticle vaccine could also protect 
from disease caused by future mismatched 
zoonotic sarbecoviruses. 

Regarding the potential for mismatched pro- 
tection conferred by immunization with the 
homotypic SARS-2 RBD-mi3 nanoparticles, 
although the homotypic nanoparticles did not 
confer complete protection against SARS-1 in 
K18-hACE2 mice, viral loads in lung tissue ob- 
tained from immunized animals were reduced 
compared with those from control animals. 
Thus, only one of four animals had detectable 
viral loads in the lungs compared with four of 
four in the control group, demonstrating that 
some level of protection was achieved. A similar 
outcome was reported upon vaccination of 
aged mice with RBD-scNP, a homotypic SARS-2 
RBD-conjugated ferritin nanoparticle that in- 
duced neutralizing antibodies against SARS-2 
and preemergent sarbecoviruses: Upon subse- 
quent challenge with mouse-adapted SARS-1, 
viral loads were significantly reduced, but 
not absent, in lung tissue (four of five vacci- 
nated animals exhibited reduced but detectable 
SARS-1 virus, as compared with five of five with 
higher viral loads in the control group) (75). 

The presence of immunodominant epitopes 
on viral antigens has contributed to preventing
the development of a universal influenza vac- 
cine and vaccines against other antigenically 
variable viruses, such as HIV-1 and hepatitis C 
(76). Presentation of related, but antigenically 
different, viral antigens on a mosaic nanoparticle, 
a method to subvert immunodominance (77), 
is appropriate for making a pan-sarbecovirus 
vaccine because the SARS-2 S trimer contains
immunodominant epitopes within its RBD
that limit the breadth of antibodies elicited 
by SARS-2 infection or vaccination. Another 
potential advantage of a mosaic approach is 
that presentation of conserved T cell epitope 
peptides across the RBDs present on a mosaic 
nanoparticle could facilitate increased protec- 
tion from disease that results from infection 
with a mismatched sarbecovirus. Our demon- 
strations that a mosaic RBD protected against 
challenges from both matched and mismatched 
sarbecoviruses, as compared with homotypic 
SARS-2 RBD nanoparticles that protected fully 
only against a matched challenge, are con- 
sistent with RBD mapping experiments showing 
that mosaic-8b but not homotypic SARS-2 
RBD-mi3 nanoparticles primarily elicited anti- 
bodies against conserved RBD regions rather 
than the immunodominant class 1 and class 2 
RBD epitopes. By including eight different 
RBD antigens arranged randomly on a 60-mer 
nanoparticle, as compared with a smaller num- 
ber of different RBDs that are not arranged 
randomly (78), the chances of stimulating pro- 
duction of cross-reactive antibodies against 
conserved regions are maximized because adjacent
antigens are unlikely to be the same
(77). We recently described isolation of cross 
RBD-binding monoclonal antibodies from 
mosaic-8-immunized mice that included 
potent neutralizing antibodies against class 1/4
and class 3 RBD epitopes, thus also demon- 
strating targeting of conserved epitopes at the 
level of individual antibodies (79). 

The plug-and-display approach facilitated 
by SpyCatcher-SpyTag methodology (37, 80) 
allows the straightforward production of mosaic 
nanoparticles with different RBDs attached 
randomly. Such nanoparticles could be used to 
protect against COVID-19 and future sarbeco- 
virus spillovers and easily adapted to make 
other pan-coronavirus vaccines, for example,
against Middle East respiratory syndrome
(MERS)-like betacoronaviruses and/or against 
alpha- or deltacoronaviruses. Given the recent 
plethora of SARS-2 VOCs and VOIs that may 
be arising at least in part because of antibody 
pressure, a relevant concern is whether more 
conserved RBD epitopes might be subject to 
substitutions that would render ineffective 
vaccines and/or monoclonal antibodies that 
target these regions. Although direct proof 
remains to be established, this scenario seems 
unlikely because RBD regions conserved be- 
tween sarbecoviruses and SARS-2 variants 
are generally involved in contacts with other 
regions of spike trimer (Fig. 6C) and there- 
fore less likely to tolerate selection-induced 
substitutions. 


Materials and methods 
Protein expression 


Mammalian expression vectors encoding the 
RBDs of SARS-2 Beta (GenBank QUT64557.1),
SARS-2 WA1 (GenBank MN985325.1), SARS-2
Delta (GenBank QWK65230.1), SARS-2 Omicron
BA.1 (GenBank UFO69278.1), SARS-2 Omicron
BA.2 (GenBank UFO69279.1), SARS-2
Omicron BA.2.12.1 (GenBank UMZ92892.1),
SARS-2 Omicron BA.4/BA.5 (GenBank
UPP14409.1), RaTG13-CoV (GenBank QHR63300),
SHC014-CoV (GenBank KC881005), Rs4081-CoV
(GenBank KY417143), pangolin17-CoV
(GenBank QIA48632), RmYN02-CoV (GISAID
EPI_ISL_412977), Rf1-CoV (GenBank DQ412042),
WIV1-CoV (GenBank KF367457), Yun11-CoV
(GenBank JX993988), BM-4831-CoV (GenBank
NC014470), BtKY72-CoV (GenBank KY352407),
Khosta-2-CoV (QVN46569.1), RsSTT200-CoV
(EPI_ISL_852605), LYRa3-CoV (AHX37569.1),
and SARS-1 (GenBank AAP13441.1) with an
N-terminal human IL-2 or Mu phosphatase
signal peptide were constructed as previously
described (34, 87). Five of the eight RBD genes (SARS-2
WA1, RaTG13, pang17, WIV1, and SHC014)
used to make mosaic-8g were altered by site-directed
mutagenesis to include a potential
N-linked glycosylation site (PNGS) (N at position
484 and T at position 486). Each RBD
was expressed to include a C-terminal hexa- 
histidine tag (G-HHHHHH) and SpyTag003 
(RGVPHIVMVDAYKRYK) (39) (for coupling 
to SpyCatcher003-mi3) or only a 15-residue 
Avi-tag (GLNDIFEAQKIEWHE) followed by 
a 6xHis tag (for ELISAs). RBDs were purified 
from transiently-transfected Expi293F cell 
(Gibco) supernatants by Ni-NTA and SEC as 
described (87), and RBDs with an introduced 
PNGS used for making mosaic-8g RBD-mi3
were compared to their counterpart RBDs by 
SDS-PAGE to verify addition of extra N-glycans. 
SEC RBD fractions identified by SDS-PAGE 
were pooled and stored at 4°C or frozen in 
liquid nitrogen and stored at -80°C for longer 
term storage. A soluble SARS-2 trimer with 6P 
stabilizing mutations (44) was expressed and 
purified as described (26). Monoclonal human 
IgGs and human ACE2 fused to human IgG
Fc (hACE2-Fc) were expressed and purified as
described (26, 32). 


Preparation of RBD-mi3 nanoparticles 


SpyCatcher003-mi3 nanoparticles (80) were 
expressed in BL21 (DE3)-RIPL Escherichia coli 
(Agilent) transformed with the pET28a His6- 
SpyCatcher003-mi3 gene (Addgene) as de- 
scribed (34, 82). Briefly, transformed bacterial 
cell pellets were lysed in the presence of 2.0 mM 
PMSF (Sigma). Lysates were spun at 21,000 × g
for 30 min, filtered with a 0.2 μm filter, and mi3
particles were isolated by Ni-NTA chroma- 
tography using a HisTrap™ HP column (GE 
Healthcare). Eluted particles were concentrated 
using an Amicon Ultra 15 ml 30K concen- 
trator (MilliporeSigma) and SEC purified 
using a HiLoad® 16/600 Superdex® 200 (GE 
Healthcare) column equilibrated with 25 mM 
Tris-HCl pH 8.0, 150 mM NaCl, 0.02% NaN3
(TBS). SpyCatcher003-mi3 particles were stored
at 4°C for up to 1 month and used for conjugations
after 0.2 μm filtering or spinning at
21,000 × g for 10 min.

Purified SpyCatcher003-mi3 nanoparticles 
were incubated with a 2-fold molar excess 
(RBD to mi3 subunit) of SpyTagged RBD (either 
a single RBD for homotypic SARS-2 RBD par- 
ticles or an equimolar mixture of eight RBDs 
for mosaic particles) overnight at room tem- 
perature in TBS. The nanoparticles included 
the following RBDs: SARS-2 Beta (homotypic 
RBD-mi3); SARS-2 Beta, RaTG13, SHC014,
Rs4081, RmYN02, pang17, Rf1, and WIV1
(mosaic-8b RBD-mi3); and N-glycan modified
versions of the clade 1a and 1b RBDs (SARS-2
WA1, RaTG13, SHC014, pang17, and WIV1)
together with unmodified Rs4081, RmYN02,
and Rf1 RBDs (mosaic-8g RBD-mi3). For
mosaic-8 mi3 nanoparticles, equivalent conju- 
gation of each of the eight SpyTagged RBDs 
was verified as described by SEC and SDS- 
PAGE analysis of conjugations to make homo- 
typic nanoparticles (34). 

Conjugated RBD-mi3 particles were sepa- 
rated from free RBDs by SEC on a Superose 6 
10/300 column (GE Healthcare) equilibrated 
with PBS (20 mM sodium phosphate pH 7.5, 
150 mM NaCl) and fractions corresponding to 
conjugated RBD-mi3 and free RBD were iden- 
tified by SDS-PAGE. Concentrations of conju- 
gated mi3 particles were determined using a 
Bio-Rad Protein Assay and are reported based 
on RBD content. 

RBD-mi3 nanoparticles were evaluated for 
binding to a human ACE2-Fc construct (32) 
and to human monoclonal antibodies that 
recognize known RBD epitopes by ELISA. 
Duplicate samples of 20 μl of a 2.5 μg/ml solution
of a purified RBD-mi3 nanoparticle in
0.1 M NaHCO3 pH 9.8 were coated onto Nunc®
MaxiSorp 384-well plates (Sigma) and incu- 
bated overnight at 4°C. After blocking with 
3% bovine serum albumin (BSA) in TBS con- 
taining 0.1% Tween 20 (TBS-T) for 1 hour at 
room temperature, plates were washed with 
TBS-T, and purified hACE2-Fc or human IgG 
(50 μg/ml with eight fourfold serial dilutions in
TBS-T/3% BSA) was added to plates for 3 hours at
room temperature. Plates were then washed 
again for 1 hour at room temperature, and a 
1:100,000 dilution of secondary HRP-conjugated 
goat anti-human IgG (Abcam) was added. 
SuperSignal ELISA Femto Maximum Sensi- 
tivity Substrate (ThermoFisher) was added to 
plates following manufacturer instructions, and 
plates were read at 425 nm. For the homotypic 
SARS-2 Beta ELISA shown in fig. S3F, 50 μl of
a 2.5 μg/ml solution of a purified RBD-mi3
nanoparticle in 0.1 M NaHCO3 pH 9.8 was
coated onto Corning® 96 well plates (Sigma) 
and incubated overnight at 4°C. After blocking 
with 3% bovine serum albumin (BSA) in TBS 
containing 0.1% Tween 20 (TBS-T) for 1 hour
at room temperature, plates were washed with
TBS-T, and purified hACE2-Fc or human IgG
(50 μg/ml with eight fourfold serial dilutions in
TBS-T/3% BSA) was added to plates for 3 hours at
room temperature. Plates were then washed 
again for 1 hour at room temperature, and a 
1:10,000 dilution of secondary HRP-conjugated 
goat anti-human IgG (Abcam) was added. 1-step 
Ultra TMB-ELISA (ThermoFisher) was added 
to plates following manufacturer instructions, 
and plates were read at 450 nm. 
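
The titration series described above (50 μg/ml starting concentration, eight fourfold dilutions) can be tabulated with a few lines of Python. This is only a sketch: it assumes the 50 μg/ml well is the first of the eight points, whereas the text could also be read as eight dilutions beyond the starting well, and the function name is hypothetical.

```python
def serial_dilution(start, fold, points):
    """Concentrations of a dilution series, highest first.

    start:  concentration of the first well (here in ug/ml)
    fold:   dilution factor between consecutive wells
    points: number of wells in the series, including the first
    """
    return [start / fold**i for i in range(points)]


# Assumed reading of the protocol: 8 wells, 50 ug/ml in the first one.
concentrations = serial_dilution(50.0, 4, 8)
```

Under this reading the last well is 50/4^7 μg/ml, roughly 0.003 μg/ml, so the series spans a little over four orders of magnitude.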

Prior to shipping for immunization and chal- 
lenge studies, aliquots of conjugated RBD-mi3 
nanoparticles were frozen in liquid nitrogen 
and then lyophilized (36) in PBS pH 7.4 using 
a Labconco CentriVap Benchtop Concentrator 
at -4°C. For immunizations, distilled water 
was added to rehydrate to a concentration of 
1 mg/ml for a working stock, and the solu- 
tion was gently pipetted and then spun at 
20,000 × g for 10 min to remove aggregates.


DLS and EM characterizations of 
RBD-mi3 nanoparticles 


DLS was used to determine the hydrodynamic 
radii of conjugated nanoparticles. Lyophilized 
nanoparticles were rehydrated as described 
above. Sample sizes of 100 μl were loaded into
a disposable cuvette, and DLS measurements 
were performed on a DynaPro NanoStar (Wyatt 
Technology) using settings suggested by the 
manufacturer. A fit of the second order auto- 
correlation function to a globular protein model 
was used to derive the hydrodynamic radius,
which was plotted in GraphPad Prism 9.3.1.
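
The instrument software performs the fit, and its internals are not described here. As background only, DLS analyses generally convert a fitted translational diffusion coefficient into a hydrodynamic radius through the Stokes-Einstein relation, R_h = k_B T / (6 π η D). The sketch below shows that conversion; the temperature and viscosity defaults (water at 25°C) and the function names are assumptions, not values from this study.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def hydrodynamic_radius_nm(diffusion_m2_s, temp_k=298.15, viscosity_pa_s=8.9e-4):
    """Stokes-Einstein: R_h = k_B * T / (6 * pi * eta * D), returned in nm."""
    radius_m = K_B * temp_k / (6 * math.pi * viscosity_pa_s * diffusion_m2_s)
    return radius_m * 1e9


def diffusion_from_radius(radius_nm, temp_k=298.15, viscosity_pa_s=8.9e-4):
    """Inverse relation: diffusion coefficient (m^2/s) for a given radius."""
    return K_B * temp_k / (6 * math.pi * viscosity_pa_s * radius_nm * 1e-9)


# Illustrative only: a hypothetical 20 nm particle in water at 25 degrees C.
d_example = diffusion_from_radius(20.0)
```

Because the radius scales inversely with the diffusion coefficient, aggregates show up as slower-diffusing species, which is one reason rehydrated samples are spun before measurement.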

Mosaic-8b RBD-mi3 and homotypic SARS-2 
RBD-mi3 were compared by negative-stain EM. 
Ultrathin, holey carbon-coated, 400 mesh Cu 
grids (Ted Pella, Inc.) were glow discharged 
(60 s at 15 mA), and a 3 μl aliquot of SEC-purified
RBD-mi3 nanoparticles, diluted
to ~40 to 100 μg/ml, was applied to the grids
for 60 s. Grids were then negatively stained
with 2% (w/v) uranyl acetate for 30 s. Images 
were collected with a 120 keV FEI Tecnai T12 
transmission electron microscope at 42,000x 
magnification. 


K18-hACE2 mice 


The Institutional Animal Care and Use Committee
at Rocky Mountain Laboratories approved the
animal studies, which were
conducted in an Association for Assessment 
and Accreditation of Laboratory Animal Care- 
accredited facility, following the basic princi- 
ples and guidelines in the Guide for the Care 
and Use of Laboratory Animals eighth edition, 
the Animal Welfare Act, US Department of 
Agriculture, and the US Public Health Service 
Policy on Humane Care and Use of Laboratory 
Animals. 

Animals were kept in climate-controlled 
rooms with a fixed light/dark cycle (12 hours/ 
12 hours). Mice were cohoused in rodent cages,
fed a commercial rodent chow with ad libitum
water, and monitored at least once daily. The 
Institutional Biosafety Committee (IBC)- 
approved work with infectious SARS-1 and 
SARS-2 viruses was conducted under biosafety 
level 3 (BSL3) conditions. All sample inactiva- 
tion was performed according to IBC-approved 
standard operating procedures for removal of 
specimens from high containment. 


Cells and virus for K18-hACE2 mouse studies 


Virus propagation was performed in VeroE6 
cells in DMEM containing 2% FBS, 1 mM 
L-glutamine, penicillin (50 U/ml), and streptomycin
(50 μg/ml) (DMEM2). The consensus
sequence of the virus stock (SARS-CoV-2 Beta, 
isolate hCoV-19/USA/MD-HP01542/2021) used 
for these experiments was identical to the ini- 
tial sequence deposited on GISAID (EPI_ISL_ 
890360), and no contaminants or additional 
mutations were detected. VeroE6 cells were 
maintained in DMEM supplemented with 10% 
fetal bovine serum, 1 mM L-glutamine, penicillin 
(50 U/ml), and streptomycin (50 μg/ml). VeroE6
cells were provided by R. Baric (University of 
North Carolina at Chapel Hill). Mycoplasma 
testing was performed at monthly intervals, and
no mycoplasma was detected. 


Vaccination and infection of K18-hACE2 mice 


K18-hACE2 mice (4 to 6 weeks old) were vaccinated
intramuscularly with 2 × 50 μl of 5 μg (RBD
equivalents; 11.4 μg of total RBD-mi3) of RBD-mi3
or 5 μg of unconjugated mi3, adjuvanted with
AddaVax (1:1 v/v), at day 0 and day 28 and were
challenged 28 days after the second immunization.
Fourteen days before virus challenge, animals
were bled via the submandibular vein. Ten
animals per group were challenged with 30 μl of
10° TCID50 SARS-2/human/USA/MD-HP01542/2021 or SARS-1
(Tor2) diluted in sterile Dulbecco’s modified 
Eagle’s medium (DMEM). Weight was recorded 
daily. Six mice per group were observed for 
survival up to 28 days post challenge or until 
they reached end-point criteria. End-point 
criteria were as follows: labored breathing 
or ambulatory difficulties or weight loss ex- 
ceeding 20%. Four animals per group were 
euthanized on day 4 post challenge to collect 
oropharyngeal swabs and lung tissue for 
virology and histology analysis. 


Virus titration after K18-hACE2 mouse challenge 


Lung tissue sections were weighed and homogenized
in 750 μl of DMEM. Virus titrations
were performed by end point titration in 
VeroE6 cells expressing transmembrane pro- 
tease serine 2 (TMPRSS-2) and human ACE2 
(BEI resources, NR-54970), which were in- 
oculated with 10-fold serial dilutions of virus 
swab medium or tissue homogenates in 96-well 
plates. When titrating tissue homogenate, cells 
were washed with PBS and 100 μl of DMEM2.
Cells were incubated at 37°C and 5% CO2, and
cytopathic effect (CPE) was assessed 6 days 
later. 
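
The text does not state how end-point titers were computed from the CPE readout for the mouse samples. As one illustration, the sketch below applies the Reed-Muench interpolation, which this paper names for the NHP TCID50 assay. The function name, the input layout, and the well counts are hypothetical.

```python
def reed_muench_log10_titer(log10_dilutions, positive_wells, wells_per_dilution):
    """Reed-Muench 50% end point, as log10 TCID50 per inoculum volume.

    log10_dilutions:    e.g. [-1, -2, -3], most concentrated first
    positive_wells:     CPE-positive wells at each dilution
    wells_per_dilution: replicate wells per dilution
    Returns None when the 50% end point is not bracketed by the series.
    """
    negative_wells = [wells_per_dilution - p for p in positive_wells]
    n = len(positive_wells)
    # Positives accumulate from the most dilute well upward,
    # negatives from the most concentrated well downward.
    cum_pos = [sum(positive_wells[k:]) for k in range(n)]
    cum_neg = [sum(negative_wells[: k + 1]) for k in range(n)]
    pct = [100.0 * p / (p + q) for p, q in zip(cum_pos, cum_neg)]
    for k in range(n - 1):
        if pct[k] >= 50.0 > pct[k + 1]:
            # Proportionate distance between the two bracketing dilutions.
            distance = (pct[k] - 50.0) / (pct[k] - pct[k + 1])
            step = log10_dilutions[k] - log10_dilutions[k + 1]
            return -(log10_dilutions[k] - distance * step)
    return None


# Hypothetical plate: tenfold dilutions in quadruplicate.
example_titer = reed_muench_log10_titer([-1, -2, -3, -4, -5], [4, 4, 3, 1, 0], 4)
```

For this made-up plate the cumulative percentages are 80% at 10^-3 and 20% at 10^-4, so the end point falls halfway between them and the function returns 3.5. Dividing 10^3.5 by the inoculum volume in ml would give a titer per ml.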


RNA extraction and quantitative RT-PCR 


RNA was extracted from oropharyngeal swabs
using a QIAamp Viral RNA kit (Qiagen)
according to the manufacturer’s instructions. 
Tissue was homogenized and extracted using 
the RNeasy kit (Qiagen) according to the 
manufacturer’s instructions. Viral gRNA- and 
sgRNA-specific assays (48) were used for the 
detection of viral RNA. The RT-PCR reaction 
(with 5 μl template viral RNA) was performed
using the QuantStudio (Thermo Fisher Scien- 
tific) according to instructions of the manufac- 
turer. Dilutions of SARS-2 with known genome 
copies were run in parallel to generate
the standard curves.
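
A standard curve of this kind is usually a straight-line fit of cycle threshold (Ct) against log10 of the known copy number, which is then inverted for unknown samples. The sketch below illustrates that calculation with invented Ct values; the function names and numbers are assumptions and do not come from this study.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares line Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the fitted line to estimate copies for a measured Ct."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical standards: tenfold dilutions with made-up Ct values.
standards = [1e2, 1e3, 1e4, 1e5]
cts = [33.4, 30.1, 26.8, 23.5]
slope, intercept = fit_standard_curve(standards, cts)
```

A slope near -3.3 cycles per tenfold dilution corresponds to roughly 100% amplification efficiency, which is a common sanity check on such curves.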


Histopathology and immunostaining of 
K18-hACE2 mouse samples 


Lungs were collected upon necropsy on day 4 
post challenge and perfused with 10% neutral- 
buffered formalin. Fixation was done for at 
least 7 days. Tissues were placed in cassettes 
and processed with a Sakura VIP-6 Tissue 
Tek on a 12-hour automated schedule using a 
graded series of ethanol, xylene, and PureAffin. 
Embedded tissues were sectioned at 5 μm and
dried overnight at 42°C prior to staining. Sec- 
tions were stained with Harris hematoxylin 
(Cancer Diagnostics, no. SH3777), decolorized 
with 0.125% HCl/70% ethanol, blued in Pureview 
PH Blue (Cancer Diagnostics, no. 167020), coun- 
terstained with eosin 615 (Cancer Diagnos- 
tics, no. 16601), dehydrated, and mounted in 
Micromount (Leica, no. 3801731). An anti- 
SARS-2 nucleocapsid protein rabbit antiserum 
(generated by GenScript) was used at a 1:1000 
dilution to detect specific anti-SARS-2 immu- 
noreactivity using the Discovery ULTRA auto- 
mated staining instrument (Roche Tissue 
Diagnostics) with a Discovery ChromoMap 
DAB (Ventana Medical Systems) kit. All slides 
were examined by a board-certified veterinary 
anatomic pathologist who was blinded to study 
group allocations. Scoring was done as follows.
H&E: no lesions = 0; less than 1% = 0.5; minimal
(1 to 10%) = 1; mild (11 to 25%) = 2; moderate
(26 to 50%) = 3; marked (51 to 75%) = 4;
severe (76 to 100%) = 5. IHC attachment:
none = 0; less than 1% = 0.5; rare/few (1 to
10%) = 1; scattered (11 to 25%) = 2; moderate
(26 to 50%) = 3; numerous (51 to 75%) = 4;
diffuse (76 to 100%) = 5. The histopathology report
is summarized in data S1.
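
The percentage bands above translate directly into a lookup. The sketch below encodes them; the function name is hypothetical, and the handling of non-integer values that fall between bands (for example 10.5%, assigned here to the higher band) is an assumption, because the text lists whole-number ranges only.

```python
def lesion_score(percent_affected):
    """Map percent of tissue affected to the ordinal score listed above.

    The same bands are used for the H&E and IHC scales; only the
    descriptive labels differ.
    """
    if not 0 <= percent_affected <= 100:
        raise ValueError("percent_affected must be between 0 and 100")
    if percent_affected == 0:
        return 0
    if percent_affected < 1:
        return 0.5
    # Upper bound of each band paired with its score.
    for upper, score in ((10, 1), (25, 2), (50, 3), (75, 4), (100, 5)):
        if percent_affected <= upper:
            return score
```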


BIOQUAL Ethics Statement and Animal Exposure 


Rhesus macaques were housed and cared for 
at BIOQUAL, Inc., Rockville, MD. The study 
was performed under a BIOQUAL-approved 
IACUC protocol (no. 21-092P), in strict accordance
with the recommendations in the Guide
for the Care and Use of Laboratory Animals of
the NIH, and in accordance with BIOQUAL 
standard operating procedures. BIOQUAL is 
fully accredited by the Association for Assess- 
ment and Accreditation of Laboratory Animal 
Care (AAALAC) and through OLAW, assurance 
number A-3086. All animal procedures were 
done under anesthesia to minimize pain and 
distress, in accordance with the recommenda- 
tions of the Weatherall report “The use of non- 
human primates in research.” Teklad 5038 
primate diet was provided once daily accord- 
ing to macaque size and weight. The diet was 
supplemented daily with fresh fruit and vege- 
tables. Fresh water was given ad libitum. 


Vaccination of NHPs 


The study included 16 rhesus macaques (Macaca 
mulatta), 8 of which were immunized with 
mosaic-8b RBD-mi3 (n = 8), and 8 of which
served as unimmunized controls for SARS-2 
and SARS-1 challenges. Four immunized and 
four unimmunized control NHPs were chal- 
lenged with SARS-2, and four immunized and 
four unimmunized control NHPs were chal- 
lenged with SARS-1. Due to a shortage of avail- 
able NHPs, we could not compare mosaic-8b 
RBD-mi3 and homotypic SARS-2 Beta RBD- 
mi3 immunizations in this study. Macaques 
were 3 to 5 years old and ranged from 3.2 to 
5.1 kg in body weight. Male and female ma- 
caques per group were balanced. Studies were 
performed unblinded. Macaques were eval- 
uated by BIOQUAL veterinary staff before, 
during, and after immunizations. 

NHPs were immunized intramuscularly with 
25 μg (calculated based on RBDs; 56.8 μg of
total RBD-mi3) of mosaic-8b RBD-mi3 adjuvanted
with VAC20 (2% aluminum hydroxide
wet gel, Al2O3) (alum) (Prime and Boost 1) (kind
gift of Francis Laurent and Ruben Caputo, SPI 
Pharma) and subsequently with EmulsiPan 
adjuvant (an MF59-like, squalene-based oil 
in water emulsion adjuvant) (57) (kind gift of 
Harshet Jain, Panacea Biotec) for Boost 2. 
Each macaque received 0.5 ml into the right 
forelimb. 


SARS-2 and SARS-1 intranasal and intratracheal 
NHP challenges 


All macaques were challenged at week 11 
(3 weeks after last vaccination) through com- 
bined intratracheal (1.0 ml) and intranasal 
(0.5 ml per nostril) inoculation with an infectious
dose of 10⁵ TCID₅₀ of SARS-2 B.1.617.2
(Delta, BEI NR-55612) or SARS-1 (Urbani).
Virus was stored at -80°C before use, thawed
by hand and placed immediately on wet
ice. Stock was diluted to 5 × 10⁴ TCID₅₀ ml⁻¹
in PBS and vortexed gently for 5 s before 
inoculation. Nasal swabs, BAL, plasma, and 
serum samples were collected 7 days before 
and 2 and 4 days after challenge. Protec- 
tion from SARS-2 and SARS-1 infection was 
determined by quantitative infectious viral
load assay (TCID₅₀), and for SARS-2, also by
RT-PCR of subgenomic N RNA (N sgRNA) as
described above except that amplification
was done using the Applied Biosystems 7500
Sequence detector.
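
As a quick consistency check on the challenge dose, the stated inoculation volumes and diluted stock titer can be combined as below. This is only an arithmetic sketch; the variable names are illustrative and not part of the study protocol.

```python
# Challenge inoculum arithmetic from the volumes and stock titer stated above.
stock_tcid50_per_ml = 5e4          # diluted stock, TCID50 per ml
intratracheal_ml = 1.0             # intratracheal volume
intranasal_ml_per_nostril = 0.5    # intranasal volume per nostril
nostrils = 2

total_volume_ml = intratracheal_ml + intranasal_ml_per_nostril * nostrils
total_dose_tcid50 = stock_tcid50_per_ml * total_volume_ml

print(f"total volume: {total_volume_ml} ml")
print(f"total dose:   {total_dose_tcid50:.0e} TCID50")
```

The 2.0 ml total volume at 5 × 10⁴ TCID₅₀ per ml corresponds to a dose of 10⁵ TCID₅₀ per animal.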


TCID₅₀ and SARS-2 and SARS-1 virus PRNT₅₀
assays in NHP samples 


PRNT₅₀ (50% plaque reduction neutralization
test) assays for NHP samples were performed
in a biosafety level 3 facility at BIOQUAL, Inc.
(Rockville, MD). The TCID₅₀ assay was conducted
by addition of 10-fold graded dilutions
of samples to Vero/TMPRSS2 cell monolayers.
Serial dilutions were performed in cell culture
wells in quadruplicate. Positive (virus stock of
known infectious titer in the assay) and negative
(medium only) control wells were included
in each assay set-up. The plates were
incubated at 37°C, 5.0% CO₂ for 4 days. The
cell monolayers were visually inspected for
CPE, complete destruction of the monolayer.
TCID₅₀ values were calculated using the Reed-Muench
formula (83). For samples that had
fewer than 3 CPE-positive wells, the TCID₅₀
could not be calculated using the Reed-Muench
formula, and these samples were assigned a
titer of below the limit of detection (<2.7 log10
TCID₅₀/ml). For acceptable assay performance,
the TCID₅₀ value of the positive control tested
within 2-fold of the expected value.
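
The Reed-Muench endpoint calculation referred to above can be sketched as follows. This is a minimal illustration, not the BIOQUAL implementation: the function name, the per-well inoculum volume, and the well counts in the example are all hypothetical.

```python
import math

def reed_muench_log10_tcid50_per_ml(log10_dilutions, positive_wells,
                                    wells_per_dilution, inoculum_ml):
    """Reed-Muench 50% endpoint, returned as log10 TCID50 per ml.

    log10_dilutions: e.g. [-1, -2, -3, ...], most concentrated first.
    positive_wells:  CPE-positive wells at each dilution.
    inoculum_ml:     sample volume added per well (assumed value).
    Returns None when the 50% endpoint is not bracketed by the series.
    """
    n = len(log10_dilutions)
    negative_wells = [wells_per_dilution - p for p in positive_wells]
    # Cumulative positives accumulate from the most dilute well upward;
    # cumulative negatives accumulate from the most concentrated downward.
    cum_pos = [sum(positive_wells[i:]) for i in range(n)]
    cum_neg = [sum(negative_wells[:i + 1]) for i in range(n)]
    pct = [100.0 * p / (p + q) for p, q in zip(cum_pos, cum_neg)]

    for i in range(n - 1):
        if pct[i] >= 50.0 > pct[i + 1]:
            # Proportional distance between the two bracketing dilutions.
            pd = (pct[i] - 50.0) / (pct[i] - pct[i + 1])
            step = log10_dilutions[i] - log10_dilutions[i + 1]
            log10_endpoint = log10_dilutions[i] - pd * step
            return -log10_endpoint - math.log10(inoculum_ml)
    return None

# Hypothetical quadruplicate plate, 10-fold dilutions, 0.1 ml per well.
titer = reed_muench_log10_tcid50_per_ml(
    [-1, -2, -3, -4, -5, -6], [4, 4, 3, 1, 0, 0], 4, 0.1)
print(titer)
```

Samples whose wells never cross 50% return `None` here, which loosely corresponds to the cases reported as below the limit of detection.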

To measure neutralization activity, sera from 
each NHP were diluted to 1:10 followed by a 
3-fold serial dilution. Diluted samples were 
then incubated with ~30 plaque-forming units 
of wild-type SARS-2 USA-WA1/2020 (BEI NR- 
52281), B.1.351 (Beta, 501Y.V2.HV, NR-54974), 
or B.1.617.2 (Delta, BEI NR-55612) variants, in 
an equal volume of culture medium for 1 hour 
at 37°C. The serum-virus mixtures were added 
to a monolayer of confluent Vero E6 cells and 
incubated for one hour at 37°C in 5% CO₂.
Each well was then overlaid with culture me- 
dium containing 0.5% methylcellulose and 
incubated for 3 days at 37°C in 5% CO₂. The
plates were then fixed with methanol at -20°C 
for 30 min and stained with 0.2% crystal violet 
for 30 min at room temperature. PRNT₅₀ values were
estimated by determining the dilution at which 
plaques were reduced by 50% with respect to 
viral control. 
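
One way to estimate the dilution giving 50% plaque reduction is to interpolate between the two bracketing dilutions on a log scale. The sketch below assumes log-linear interpolation, which the text does not specify; the function name and plaque counts are hypothetical.

```python
import math

def prnt50(reciprocal_dilutions, plaque_counts, control_plaques):
    """Reciprocal serum dilution at which plaques are reduced by 50%.

    reciprocal_dilutions: e.g. [10, 30, 90, ...], most concentrated first.
    plaque_counts:        plaques counted at each dilution.
    control_plaques:      plaques in the virus-only control.
    Returns None when 50% reduction is not bracketed by the series.
    """
    reductions = [1.0 - c / control_plaques for c in plaque_counts]
    for i in range(len(reductions) - 1):
        upper, lower = reductions[i], reductions[i + 1]
        if upper >= 0.5 > lower:
            frac = (upper - 0.5) / (upper - lower)
            log_a = math.log10(reciprocal_dilutions[i])
            log_b = math.log10(reciprocal_dilutions[i + 1])
            return 10 ** (log_a + frac * (log_b - log_a))
    return None

# Hypothetical series: 1:10 start, 3-fold steps, 30 plaques in the control.
estimate = prnt50([10, 30, 90, 270, 810], [0, 6, 24, 30, 30], 30)
print(round(estimate, 1))
```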


Histopathology of SARS-2 infected NHPs 


At necropsy, lung was collected and placed 
in 10% neutral buffered formalin for histo- 
pathologic analysis. Tissue sections were 
trimmed and processed for H&E-stained slides 
and examined by a board-certified pathologist 
at Experimental Pathology Laboratories (EPL) 
in Sterling, Virginia. Histopathologic findings 
are presented in the Individual Data Listing 
of Histopathology tables. Findings were graded 
from 1 to 5, depending upon severity, and summarized
by treatment group in the Incidence
Summary of Microscopic Findings by Sacrifice.
An explanation of descriptive severity grades
is provided at the end of the tables; equivalent
numbered grades are 1 = minimal, 2 = mild,
3 = moderate, 4 = marked, 5 = severe.


Cohen et al., Science 377, eabq0839 (2022)


Luminex chemokine and cytokine assays 


Samples were analyzed using a Cytokine/ 
Chemokine/Growth Factor 37-Plex NHP 
ProcartaPlex Panel (EPX370-40045-901). Targets
[bead region] were as follows:

T helper: GM-CSF [44], IFN gamma [43], 
IL-1 beta [18], IL-2 [19], IL-4 [20], IL-5 [21], 
IL-6 [25], IL-8 (CXCL8) [27], IL-10 [28], IL- 
12p70 [34], IL-13 [35], IL-17A (CTLA-8) [36], 
IL-18 [66], IL-23 [63], TNF alpha [45] 

Cytokines: CD40L [74], G-CSF (CSF-3) [42], 
IFN alpha [48], IL-1RA [38], IL-7 [26], IL-15 [65]

Chemokines: BLC (CXCL13) [15], Eotaxin 
(CCL11) [33], IP-10 (CXCL10) [22], I-TAC 
(CXCL11) [46], MCP-1 (CCL2) [51], MIG (CXCL9)
[54], MIP-1 alpha (CCL3) [12], MIP-1 beta 
(CCL4) [47], SDF-1 alpha [13] 

Growth Factors: BDNF [57], FGF-2 [75], NGF 
beta [55], PDGF-BB [77], SCF [39], VEGF-A [78], 
VEGF-D [53] 

BAL samples were thawed at room temperature,
quickly vortexed for 10 s, and then centrifuged
to pellet and remove debris. Samples
were analyzed in duplicate using the above
panel according to the manufacturer’s protocol.
Briefly, 200 μl of bead mix solution was
added to all wells and washed using a Bio-Plex
Pro Wash Station (Bio-Rad). Then, 25 μl of Universal
Assay Buffer was added to all wells, and
25 μl of samples, standards, or blanks (blanks used
Universal Assay Buffer) was added to the
relevant wells, followed by a 2-hour incubation on an
orbital shaker (400 to 500 rpm) at room temperature.
Plates were then washed on the
magnetic plate washer, and 25 μl of prepared
detection antibody was added to all wells,
followed by a 30-min incubation with agitation at
room temperature. Plates were washed again,
and 50 μl of the provided Streptavidin-PE solution
was added to all wells, followed by a 30-min incubation
with agitation at room temperature.
Once the incubation was complete, the plates
were again washed using the magnetic plate
washer. Inactivation was then performed overnight
using 100 μl per well of 10% formalin
solution (VWR). Plates were run on a Bio-Plex
200 System (Bio-Rad), which was preprogrammed
according to kit specifications
for optimal signal detection. Values below the
assay range (OOR<) were assigned the lowest
standard cutoff point, based on the manufacturer’s
standard concentrations.


Mouse and NHP serum ELISAs 


20 μl of a 2.5 μg/ml solution of an affinity-purified
His-tagged RBD in 0.1 M NaHCO₃ pH 9.8
was coated onto Nunc MaxiSorp 384-well plates 
(Sigma) and incubated overnight at 4°C. After
blocking with 3% bovine serum albumin (BSA) 
in TBS containing 0.1% Tween 20 (TBS-T) 
for 1 hour at room temperature, plates were 
washed with TBS-T, and mouse or NHP serum 
diluted 1:100 and then serially diluted by 4-fold 
with TBS-T/3% BSA was added to the plates 
for 3 hours at room temperature. Plates were 
then washed again for 1 hour at room temper- 
ature, and a 1:50,000 dilution of secondary 
HRP-conjugated goat anti-mouse IgG (Abcam) 
was added. SuperSignal ELISA Femto Max- 
imum Sensitivity Substrate (ThermoFisher) 
was added to plates following manufacturer’s 
instructions, and plates were read at 425 nm. 
Curves were plotted and analyzed to obtain 
midpoint titers (EC₅₀ values) using GraphPad
Prism 9.3.1 (GraphPad Software, San Diego,
CA) assuming a one-site binding model with 
a Hill coefficient. Titer differences were eval- 
uated for statistical significance between 
groups using analysis of variance (ANOVA) 
test followed by Tukey’s multiple comparison
post hoc test calculated using GraphPad
Prism 9.3.1. 
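
To illustrate what a midpoint titer from a one-site binding model with a Hill coefficient means, the sketch below fits a Hill curve to a titration by coarse grid search. It is a stand-in for the nonlinear regression performed in GraphPad Prism, not a reproduction of it: the parameterization, the grid ranges, the choice to fix the plateaus at the data extremes, and the synthetic data are all assumptions.

```python
def hill(dilution, bottom, top, ec50, h):
    """Signal at a reciprocal serum dilution; equals the midpoint at ec50."""
    return bottom + (top - bottom) / (1.0 + (dilution / ec50) ** h)

def fit_ec50(dilutions, signals):
    """Grid-search least squares over EC50 and the Hill coefficient.

    Plateaus are fixed at the observed minimum and maximum signal,
    a simplification that a full four-parameter fit would not make.
    """
    bottom, top = min(signals), max(signals)
    best = None
    for i in range(501):                 # log10(EC50) from 1.00 to 6.00
        ec50 = 10 ** (1.0 + 0.01 * i)
        for j in range(1, 41):           # Hill coefficient from 0.1 to 4.0
            h = 0.1 * j
            sse = sum((hill(d, bottom, top, ec50, h) - s) ** 2
                      for d, s in zip(dilutions, signals))
            if best is None or sse < best[0]:
                best = (sse, ec50, h)
    return best[1], best[2]

# Synthetic titration: 1:100 start, 4-fold steps, true EC50 of 3200.
dilutions = [100 * 4 ** k for k in range(7)]
signals = [hill(d, 0.0, 100.0, 3200.0, 1.0) for d in dilutions]
ec50_est, hill_est = fit_ec50(dilutions, signals)
print(round(ec50_est), round(hill_est, 1))
```

Because the plateaus are pinned to the data, the recovered EC₅₀ is only approximate; a dedicated curve-fitting routine would estimate all four parameters jointly.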


Mouse and NHP serum pseudovirus 
neutralization assays 


Lentiviral-based SARS-2 variants (WA1, Beta,
Delta, Omicron BA.1, BA.2, BA.2.12.2, and BA.4/
BA.5), SARS-1, WIV1, SHC014, and BtKY72
K493Y/T498W (73) (kind gift of A. Walls and 
D. Veesler, University of Washington) pseu- 
doviruses were prepared as described (20, 84) 
using genes encoding S protein sequences lack- 
ing C-terminal residues in the cytoplasmic tail: 
21 amino acid deletions for SARS-2 variants, 
WIV1, SHC014, and BtKY72 and a 19 amino acid
deletion for SARS-CoV. LYRa3 and Khosta-2 
pseudoviruses were made as described (85) 
using genes encoding chimeric spikes contain- 
ing the RBD from LYRa3 (residues 330-506) or 
Khosta-2 (residues 324 to 500) and the remain- 
ing portions of the spike protein from SARS-1 
(substituting for SARS-1 RBD residues 323 to 
501). For neutralization assays, threefold serially 
diluted sera from immunized mice or NHPs 
were incubated with a pseudovirus for 1 hour 
at 37°C, then the serum-virus mixture was 
added to 293T-ACE2 target cells and incubated
for 48 hours at 37°C. Media was removed, cells 
were lysed with Britelite Plus reagent (Perkin 
Elmer), and luciferase activity was measured 
as relative luminescence units (RLUs). Relative
RLUs were normalized to RLUs from cells in- 
fected with pseudotyped virus in the absence 
of antiserum. Half-maximal inhibitory dilutions 
(ID₅₀ values) were derived using 4-parameter
nonlinear regression in AntibodyDatabase 
(86). Statistical significance of titer differences
between groups was evaluated using ANOVA
test followed by Tukey’s multiple comparison
post hoc test of ID₅₀ values converted to log10
scale using GraphPad Prism 9.3.1.


Mouse serum samples for RBD epitope mapping


Animal procedures and experiments were 
performed at Labcorp Drug Development 
(formerly Covance, Inc.) according to proto- 
cols approved by the IACUC to obtain serum 
samples for RBD epitope mapping experiments. 
Immunizations of mosaic-8b or homotypic 
SARS-2 Beta (5 μg each based on RBD content,
11.4 μg of total RBD-mi3) in 100 μl of 50% v/v
AddaVax adjuvant (InvivoGen) were done using
intramuscular (IM) injections of 7- to 8-week-old
female BALB/c mice (Envigo) (8 animals per 
cohort). Animals were boosted 3 weeks after 
the prime with the same quantity of antigen 
in adjuvant. Animals were bled under anes- 
thesia approximately every 2 weeks via orbital 
sinus and then euthanized 7 weeks after the 
prime (Day 49) after blood collection from the 
jugular vein. Blood samples were stored at 
room temperature in serum separator tubes 
(BD Microtainer) to allow clotting. Serum 
was then harvested into microtubes (Mikro- 
Schraubrohre) and stored at -80°C until use. 


RBD sequencing library construction and 
SARS-2 enrichment 


To construct sequencing libraries for RBD epitope
mapping of mouse sera, 25 μl of ds-cDNA
was brought to a final volume of 53 μl in elution
buffer (Agilent Technologies) and sheared
on a Covaris LE220 (Covaris) to generate an 
average size of 180 to 220 base pairs (bp). The 
following settings were used: peak incident 
power, 450 W; duty factor, 15%; cycles per burst, 
1000; and time, 300 s. The Kapa HyperPrep kit 
was used to prepare libraries from 50 μl of
each sheared cDNA sample following modifica- 
tions of the Kapa HyperPrep kit (version 8.20) 
and SeqCap EZ HyperCap Workflow (version 2.3) 
user guides (Roche Sequencing Solutions Inc.). 
Adapter ligation was performed for 1 hour at 
20°C using the Kapa Unique-Dual Indexed 
Adapters diluted to 1.5 μM concentration
(Roche Sequencing Solutions Inc.). After ligation,
samples were purified with AMPure XP
beads (Beckman Coulter) and subjected to 
double-sided size selection as specified in the 
SeqCap EZ HyperCap Workflow User’s Guide. 
Precapture polymerase chain reaction (PCR) 
amplification was performed using 12 cycles, 
followed by purification using AMPure XP
beads. Purified libraries were assessed for 
quality on the Bioanalyzer 2100 using the 
High-Sensitivity DNA chip assay (Agilent Tech- 
nologies). Quantification of pre-capture libra- 
ries was performed using the Qubit dsDNA 
HS Assay kit and the Qubit 3.0 fluorometer 
following the manufacturer’s instructions 
(Thermo Fisher Scientific). 

The myBaits Expert Virus bait library was 
used to enrich samples for SARS-2 accord- 
ing to the myBaits Hybridization Capture for 
Targeted NGS (version 4.01) protocol. Briefly, 
libraries were sorted according to estimated 
genome copies and pooled to create a combined
mass of 2 μg for each capture reaction.
Depending on estimated genome copies, two 
to six libraries were pooled for each capture 
reaction. Capture hybridizations were per- 
formed for 16 to 19 hours at 65°C and sub- 
jected to 8 to 14 PCR cycles after enrichment. 
SARS-2-enriched libraries were purified and 
quantified using the Kapa Library Quant Uni- 
versal quantitative PCR mix in accordance 
with the manufacturer’s instructions. Libraries 
were diluted to a final working concentration 
of 1 to 2 nM, titrated to 20 pM, and sequenced 
as 2 × 150 bp reads on the MiSeq sequencing
instrument using the MiSeq Micro kit version 2
(Illumina).


Sorting of yeast libraries to identify mutations 
that reduced binding by polyclonal antisera 


Plasma mapping experiments were performed 
in biological duplicate using the independent 
mutant RBD libraries as previously described 
(45). Prior to the yeast-display deep mutational
scanning experiments, 100 μl of each
serum sample was heat-inactivated at 56°C 
for 30 min and twice-depleted of nonspecific 
yeast-binding antibodies by incubating with 
50 OD units of AWY101 yeast containing an 
empty vector (55). Mutant yeast libraries that 
were pre-sorted for RBD expression and ACE2 
binding (55) were induced to express RBD in 
galactose-containing synthetic defined medium 
with casamino acids (6.7 g/liter Yeast Nitrogen 
Base, 5.0 g/liter Casamino acids, 1.065 g/liter MES
acid, and 2% w/v galactose + 0.1% w/v dextrose). 
16 to 18 hours post-induction, cells were washed 
and incubated with plasma at a range of dilu- 
tions for 1 hour at room temperature with gentle 
agitation. For each plasma, we chose a sub- 
saturating dilution such that the amount of 
fluorescent signal due to plasma antibody bind- 
ing to RBD was approximately equal across 
samples. The exact dilution used for each 
plasma is given in fig. S8B. The libraries were 
washed and secondarily labeled for 1 hour with 
1:100 fluorescein isothiocyanate-conjugated 
anti-MYC antibody (Immunology Consultants 
Lab, CYMC-45F) to label for RBD expres- 
sion and 1:200 Alexa Fluor-647-conjugated 
goat anti-human-IgG Fc-gamma (Jackson 
ImmunoResearch 109-135-098) to label for 
bound NHP antibodies or Alexa Fluor-647- 
conjugated goat anti-mouse-IgG Fc-gamma 
(Jackson ImmunoResearch 115-605-008) to 
label for bound mouse antibodies. A flow 
cytometric selection gate was drawn to capture 
RBD mutants with reduced antibody binding 
for their degree of RBD expression (fig. S8C). 
For each sample, 7.5 × 10⁶ to 1.1 × 10⁷ cells were
processed on the BD FACSAria II cell sorter 
(fig. S8C). Antibody-escaped cells were grown 
overnight in synthetic defined medium with 
casamino acids (6.7 g/liter Yeast Nitrogen Base, 
5.0 g/liter Casamino acids, 1.065 g/liter MES acid,
and 2% w/v dextrose + 100 U/ml penicillin +
100 μg/ml streptomycin) to expand cells prior
to plasmid extraction. 


DNA extraction and Illumina sequencing 


Plasmid samples were prepared from 30 optical
density (OD) units [1.6 × 10⁸ colony forming
units (CFUs)] of pre-selection yeast populations
and approximately 5 OD units (~3.2 × 10⁷ CFUs)
of overnight cultures of plasma-escaped cells
(Zymoprep Yeast Plasmid Miniprep II) as previously
described (54, 87). The 16-nucleotide
barcode sequences identifying each RBD
variant were amplified by polymerase chain
reaction (PCR) and prepared for Illumina sequencing
as described (54, 87). Barcodes were
sequenced on an Illumina HiSeq 2500 with 50 bp 
single-end reads. Raw sequencing data are 
available on the NCBI SRA under BioProject 
PRJNA770094, BioSample SAMN26315988. 


Analysis of deep sequencing data to compute 
each mutation’s escape fraction 


Escape fractions were computed essentially 
as described in (54) and exactly as described 
in (55). We used the dms_variants package 
(https://jbloomlab.github.io/dms_variants, 
version 0.8.10) to process Illumina sequences 
into counts of each barcoded RBD variant in 
each pre-selection and antibody-escape popu- 
lation. We computed the escape fraction for 
each barcoded variant using the deep sequenc- 
ing counts for each variant in the original 
and plasma-escape populations and the total 
fraction of the library that escaped antibody 
binding via the formula in (55). These escape 
fractions represent the estimated fraction of 
cells expressing that specific variant that falls 
in the escape bin, such that a value of 0 means 
the variant is always bound by plasma and a 
value of 1 means that it always escapes plasma 
binding. 
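
The quantities named above (per-variant counts before and after selection, and the total fraction of the library that escaped) can be combined as in the following sketch. The exact formula is the one given in (55); the form used here, a ratio of post- to pre-selection variant frequencies scaled by the library escape fraction, is an assumption for illustration, as are the function name, the pseudocount, and the numbers.

```python
def escape_fraction(n_post, n_pre, total_post, total_pre,
                    library_escape_fraction, pseudocount=0.5):
    """Estimated fraction of cells carrying a variant that fall in the escape bin.

    n_post, n_pre:           barcode counts for the variant in the escape and
                             pre-selection populations.
    total_post, total_pre:   total counts in each population.
    library_escape_fraction: fraction of the whole library sorted as escape.
    pseudocount:             small count added to avoid division by zero
                             (an assumed value).
    """
    post_freq = (n_post + pseudocount) / total_post
    pre_freq = (n_pre + pseudocount) / total_pre
    return library_escape_fraction * post_freq / pre_freq

# Hypothetical variant enriched 5-fold in an escape bin holding 10% of cells.
value = escape_fraction(50, 100, 1_000, 10_000, 0.1, pseudocount=0)
print(round(value, 3))
```

Noise in the counts can push such an estimate slightly above 1, so downstream filtering on sequencing depth matters.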

We then applied a computational filter to 
remove variants with >1 amino-acid mutation, 
low sequencing counts, or highly deleterious 
mutations that might escape antibody binding 
due to poor RBD expression or folding as de- 
scribed (55). The reported antibody-escape scores 
are the average across duplicate libraries; these 
scores are also in data S3. Correlations in final 
single-mutant escape scores are shown in fig. 
S8D. Full documentation of the computational
analysis is at https://github.com/jbloomlab/SARS-CoV-2-RBD_Beta_mosaic_np_vaccine.


Data visualization 


The static logo plot visualizations of the escape
maps in the paper figures were created using the
dmslogo package (https://jbloomlab.github.io/dmslogo,
version 0.6.2), and in all cases the
height of each letter indicates the escape fraction
for that amino-acid mutation calculated
as described above. For the mouse sera, the
static logo plots feature any site where, for
≥1 serum, the site-total antibody escape was
>10× the median across all sites and at least
10% of the maximum of any site. Due to the
relative breadth of the NHP sera, a more sensitive
threshold for displaying sites on logo plots was 
used: we include any site where the site-total 
antibody escape is >5x the median across all 
sites and at least 5% of the maximum of any site.
This resulted in sites 383, 386, and 500. Thus, 
sites 346, 352, 357, 369, 378, 384, 385, 390, 396, 
408, 462, 468, 477, 478, 484, 485, 486, and 501
were also added to the logo plots to facilitate 
comparison to the mouse sera. For each sam- 
ple, the y-axis is scaled to be the greatest of 
(a) the maximum site-wise escape metric ob- 
served for that sample, or (b) 20x the median 
site-wise escape fraction observed across all 
sites for that plasma. The code that generates 
these logo plot visualizations is available at
https://github.com/jbloomlab/SARS-CoV-2-RBD_Beta_mosaic_np_vaccine/blob/main/results/summary/escape_profiles.md. In many
of the visualizations, the RBD sites are cat- 
egorized by epitope region (23) and colored 
accordingly. Specifically, we define the class 1 
epitope as residues 403+405+406+417+420+
421+453+455-460+473-478+486+487+489+
503+504, the class 2 epitope as residues 472+
479+483-485+490-495, the class 3 epitope
as residues 341+345+346+354-357+396+
437-452+466-468+496+498-501, and the class 4
epitope as residues 365-390+408+462. 
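
The site-selection thresholds and y-axis scaling described above can be expressed compactly. The sketch below is illustrative only and is not the code in the linked repository; the function names, data layout, and escape values are hypothetical.

```python
from statistics import median

def sites_to_plot(site_escape_by_serum, median_fold, max_fraction):
    """Sites that pass both thresholds for at least one serum.

    site_escape_by_serum: {serum: {site: site-total escape}}
    median_fold:  escape must exceed this multiple of the median site.
    max_fraction: escape must be at least this fraction of the maximum site.
    """
    selected = set()
    for site_escape in site_escape_by_serum.values():
        med = median(site_escape.values())
        top = max(site_escape.values())
        for site, value in site_escape.items():
            if value > median_fold * med and value >= max_fraction * top:
                selected.add(site)
    return sorted(selected)

def y_axis_max(site_escape, median_fold=20):
    """Greater of the maximum site escape and 20x the median site escape."""
    values = site_escape.values()
    return max(max(values), median_fold * median(values))

# Hypothetical site-total escape values for one serum.
toy = {"serum_A": {484: 1.0, 417: 0.3, 501: 0.05, 346: 0.02, 378: 0.02}}
mouse_sites = sites_to_plot(toy, median_fold=10, max_fraction=0.10)
nhp_sites = sites_to_plot(toy, median_fold=5, max_fraction=0.05)
print(mouse_sites, nhp_sites, y_axis_max(toy["serum_A"]))
```

With these toy values the stricter mouse thresholds keep only site 484, while the more sensitive NHP thresholds also admit site 417.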


For the structural visualizations in figures,
the Beta RBD surface [Protein Data Bank
(PDB) ID 7LYQ] was colored by the site-wise 
escape metric at each site, with white indicating 
no escape and red scaled to be the same max- 
imum used to scale the y-axis in the logo plot 
escape maps, determined as described above. 
We created interactive structure-based visualizations
of the escape maps using dms-view (88)
that are available at https://jbloomlab.github.io/SARS-CoV-2-RBD_Beta_mosaic_np_vaccine.


INTRODUCTION: Human heart failure is a highly
morbid condition that affects 23 million indi- 
viduals worldwide. It emerges in the setting of 
an array of different cardiovascular disorders, 
which has propelled the notion that diverse 
stimuli converge on a common final pathway. 
Consistent with this, initiating etiologies do 
not direct heart failure treatments, which are 
often inadequate and necessitate mechanical 
interventions and cardiac transplantation. 
The recent application of single-nucleus RNA 
sequencing (snRNAseq) transcriptional analy- 
ses to characterize the cellular composition and 
molecular states in the healthy adult human 
heart provides an emerging benchmark by 
which disease-related changes can be assessed. 


Moreover, the discovery of human pathogenic 
variants that cause dilated cardiomyopathy 
(DCM) and arrhythmogenic cardiomyopathy 
(ACM), disorders associated with high rates 
of heart failure, provides direct opportunities 
to evaluate whether genotype influences heart 
failure pathways. 


RATIONALE: A systematic identification of shared 
and distinct molecules and pathways involved 
in heart failure is lacking, and knowledge of 
these fundamental data could propel the devel- 
opment of more effective treatments. To enable 
these discoveries, we performed snRNAseq of 
explanted ventricular tissues from 18 healthy 
donors and 61 heart failure patients. By focusing
analyses on multiple samples with pathogenic
variants in DCM genes (LMNA, RBM20,
and TTN), ACM genes (PKP2), or pathogenic
variant-negative (PV negative) samples, we
characterized genotype-stratified and common
heart failure responses.

Genotype-stratified analyses of heart failure at the single-nuclei level. The transcriptomes of
881,081 nuclei from 61 heart failure patients were profiled and compared with the transcriptional
signatures of 18 healthy controls. Genotype-stratified analyses of cell types and cell state compositions,
differential gene expression, cell-cell interactions, and machine learning illuminated the shared and
distinct transcriptional signatures resulting from pathogenic variants in DCM and ACM.


RESULTS: From 881,081 nuclei isolated from 
left and right diseased and healthy ventricles, 
we identified 10 major cell types and 71 distinct 
transcriptional states. DCM and ACM tissues 
showed significant depletion of cardiomyocytes 
and increased endothelial and immune cells. 
Fibrosis was expanded in disease hearts, but, 
unexpectedly, fibroblasts were not increased, 
and instead showed altered transcriptional 
states that indicated activated remodeling of 
the extracellular matrix. Genotype-stratified 
analyses identified multiple transcriptional 
changes shared only among the hearts har- 
boring pathogenic variants or distinctive for 
individual and subsets of DCM and ACM geno- 
types. We validated many of these by single- 
molecule fluorescent in situ hybridization. 

Through analyses of receptor and ligand ex- 
pression across all cells, we observed changes 
in intercellular signaling and communica- 
tions, such as increased endothelin signaling 
in LMNA hearts, tumor necrosis factor in PKP2 
hearts, and others. We also identified specific 
cardiac cell lineages expressing genes with 
common polymorphisms that were identified 
in validated association studies of DCM. 

Because our findings indicated genotype- 
enriched transcripts and cell states, we har- 
nessed machine learning to develop a graph 
attention network for the multinomial clas- 
sification of genotypes. This network showed 
remarkably high prediction of the genotypes 
for each cardiac sample, thereby reinforcing 
our conclusion that genotypes activate very 
specific heart failure pathways. 


CONCLUSION: snRNAseq of human ventricu- 
lar samples illuminated cell types and states, 
molecular signals, and intercellular commu- 
nications that characterize DCM and ACM. 
The cellular and molecular architectures that 
induce heart failure are both shared and dis- 
tinct across genotypes. These data provide 
candidate therapeutic targets for future re- 
search and interventional opportunities to 
improve and personalize treatments for cardio- 
myopathies and heart failure. 


Pathogenic variants damage cell composition and
single cell transcription in cardiomyopathies 

Pathogenic variants in genes that cause dilated cardiomyopathy (DCM) and arrhythmogenic cardiomyopathy
(ACM) convey high risks for the development of heart failure through unknown mechanisms. Using 
single-nucleus RNA sequencing, we characterized the transcriptome of 880,000 nuclei from 18 control and 
61 failing, nonischemic human hearts with pathogenic variants in DCM and ACM genes or idiopathic 
disease. We performed genotype-stratified analyses of the ventricular cell lineages and transcriptional 
states. The resultant DCM and ACM ventricular cell atlas demonstrated distinct right and left ventricular 
responses, highlighting genotype-associated pathways, intercellular interactions, and differential gene 
expression at single-cell resolution. Together, these data illuminate both shared and distinct cellular and 
molecular architectures of human heart failure and suggest candidate therapeutic targets. 


Dilated cardiomyopathy (DCM), a prevalent
disorder occurring in one in
250 individuals, is characterized by left 
ventricular (LV) dilatation, cardiomyo- 
cyte loss with fibrotic replacement, and 
impaired contractility (1). Arrhythmogenic
cardiomyopathy (ACM) similarly incites ven- 
tricular dysfunction, often with more promi- 
nent right ventricular (RV) involvement, high 
arrhythmogenic burden, and fibrofatty accu- 
mulations (2). Both disorders can arise from 
genetic causes (1, 2). DCM genes encode proteins
involved in contractility [titin (TTN),
troponin T (TNNT2), troponin C (TNNC1),
tropomyosin (TPM1), and filamin C (FLNC)]
that regulate cardiac splicing [RNA-binding 
motif protein (RBM20)] or calcium seques- 
tration [phospholamban (PLN)] and main- 
tain cytoskeletal [desmin (DES)] or nuclear 
[lamin A/C (LMNA)] integrity. ACM genes 
often encode desmosome proteins, includ- 
ing plakophilin-2 (PKP2) and desmoplakin 
(DSP). The cardiomyocyte-specific expression 
and damaging effects of pathogenic variants 
(PVs) in many DCM and ACM genes propel 
the development of arrhythmias and heart
failure, a highly morbid condition affecting
23 million individuals worldwide (3). 

We hypothesized that PVs in different genes 
would evoke distinct single-cell molecular phe- 
notypes. To address this, we studied the 
molecular signals underlying heart failure 
pathogenesis using single-nucleus RNA se- 
quencing (snRNAseq) of human hearts with 
advanced DCM and ACM compared with non- 
failing donor (control) hearts. We revealed 
differences in the cellular landscape and tran- 
scriptional changes between several DCM and 
ACM genotypes. By leveraging machine-learning 
approaches, we illuminated genotype-specific 
molecular responses, as validated by recon- 
structing the underlying PVs using snRNA- 
seq data. 


Results 
Study cohort 


We studied LV and RV tissues (Fig. 1A) obtained 
before any mechanical support in 61 patients 
and 18 controls (tables S1 and S2), including 
12 controls previously reported (4). Thirty- 
eight independent PVs were identified in 
three DCM genes (LMNA, n = 12; RBM20, n = 8;
TTN, n = 12) and in the ACM gene PKP2 (n = 6),
whereas no PVs were detected in eight DCM 
patients (PVneg) (Fig. 1B). Analyses were per- 
formed for these five genotypes individually 
(n = 46), for aggregated DCM genotypes 
(LMNA, RBM20, TTN, and PVneg), ACM (PKP2), 
and controls. Additionally, we generated data 
from 15 PVs across PLN, BAG3, DES, FLNC, 
FKTN, TNNC1, TNNT2, TPM1, and DSP (tables
S1 and S2). Because there were few recurrent
PVs, these genes were excluded in downstream 
analyses except where indicated. 

Males predominated among patients (60%) 
and controls (72%) (fig. S1A). The mean age
of patients was 48.4 ± 4.3 years exclusive of
RBM20 (mean age 32.9 ± 14.6 years). Clinical
manifestations indicated similar LV dysfunc- 
tion in LMNA, TTN, and PVneg patients, but 
greater LV dilatation and reduced systolic 
contraction in RBM20 and preserved LV with 
reduced RV function in PKP2 patients. LMNA
and TTN patients received more pacemaker
and/or resynchronization therapies than those
with other genotypes (table S1).

Fig. 1. PVs and unexplained causes of DCM and ACM alter cardiac morphology,
histopathology, and cellular compositions. (A) Comparisons of normal cardiac
anatomy and histology with DCM demonstrated LV dilation with fibrosis, and
ACM revealed RV dilation with fibrofatty degeneration (Masson trichrome staining;
scale bar, 10 μm). (B) Schematic depiction of the functions of DCM and ACM genes
with PVs (number indicates unique genotypes, bold denotes six or more patients)
in studied tissues. (C) Single nuclei isolated from transmural LV (free wall, apex,
or septum) and RV sections were processed using 10X Chromium 3' chemistry.


Disease-associated compositional changes of 
cell types 


Nuclei were isolated as described previously 
(4) from full thickness LV free wall, apex, sep- 
tum, and RV free wall (Fig. 1C). We compared 
~500,000 high-quality nuclei from patients’ 
ventricular tissues and ~380,000 nuclei from 
controls (fig. S1, B to E). After preprocessing 
and quality control filtering, data were inte- 
grated across samples using Harmony before 
constructing manifolds using Uniform Manifold
Approximation and Projections (UMAPs).
Clustering identified 10 major cardiac cell 
types encompassing ventricular cardiomyo- 
cytes (CMs), fibroblasts (FBs), adipocytes (ADs), 
pericytes and smooth muscle cells (mural, 
MCs), endothelial cells (ECs), myeloid and 
lymphoid (immune cells), neuronal (NCs), and 
mast cells (Fig. 1C) with 71 distinct cell states. 
States of the same cell type shared a transcrip- 
tional profile but also expressed distinct genes, 
which implied biological differences. 

Cell type abundances in sample replicates 
were highly correlated (Pearson coefficient
0.74 to 0.99) (fig. S2, A and B). Cell composition,
states, and transcript counts across
the free wall, apex, and septum showed high
similarities, and therefore are reported grouped
together as LV (fig. S2, C and D).

Fig. 1 (continued). UMAP embedding of 881,081 nuclei delineated 10 cell types and unassigned
populations (gray). (D) Upper panel: Mean abundance (%) of cell types in control
LVs. Lower panel: Proportional changes of cell types in specified genotypes or
aggregated across DCM genotypes. Proportional changes are scaled by color:
Red indicates an increase and blue a decrease in disease versus control. P values are
indicated for significant proportional changes (FDR < 0.05). (E) Pairwise cell type
abundance ratios in specified genotypes or aggregated DCM genotypes in LVs
relative to controls. Color scale, FDR, and significance are as in (D).

Using center log ratio-transformed abun- 
dance of cell types, we considered the effects of 
sex on LV and RV cell compositions in DCM 
(n = 10 female, n = 29 male) and control (n = 7 
female, n = 11 male) hearts (fig. S2E). Myeloid
cells showed a modest sex-specific difference 
[false discovery rate (FDR) = 0.016]. Only LMNA 
tissues (from n = 7 males and n = 5 females) were
sufficient for genotype-specific, sex-associated 
analyses. Female LV myeloid cells were increased
(FDR = 0.05), paralleling the findings
in DCM versus controls. In addition, RV endo- 
thelial cells showed a modest female-specific 
increase between LMNA patients and con- 
trols (FDR = 0.048). 
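
The center log ratio (CLR) transform mentioned above maps compositional data, such as cell type proportions, onto an unconstrained scale before testing. A minimal sketch follows; the handling of zero abundances (a small pseudocount) is an assumption, and the example proportions are hypothetical.

```python
import math

def clr(abundances, pseudocount=0.0):
    """Center log ratio: log of each part minus the mean log over all parts.

    A positive pseudocount is needed if any abundance is zero (assumed
    handling; the text does not describe how zeros were treated).
    """
    logs = [math.log(a + pseudocount) for a in abundances]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

# Hypothetical cell type proportions for one sample.
proportions = [0.2, 0.3, 0.5]
transformed = clr(proportions)
print([round(v, 3) for v in transformed])
```

CLR values sum to zero within a sample and are unchanged when all abundances are rescaled by a common factor, which is why the transform suits proportions and raw counts alike.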

The proportions of LV nuclei across the 
genotypes demonstrated depletion of CMs 
except in LMNA, and increased ECs and im- 
mune cells except in PKP2 (Fig. 1D and fig. S3). 
In RVs (fig. S4), CMs were depleted in the DCM 
subgroup except for TTN, whereas ECs were 
increased in LMNA, TTN, and RBM20, and 
immune cells were not changed. FBs were not 
increased in LVs and RVs (Fig. 1D) despite 
histopathological findings of fibrosis, which 
implied the acquisition of a secretory rather 
than a proliferative phenotype. 

Individual-level abundances for cell types 
(LV, table S3; RV, table S4) and cell states 
(table S5) are provided for controls and pa- 
tients. Different cell type abundances in dis- 
ease compared with control LVs remained 
significant using a linear model adjusting for 
age. Pairwise cell type ratios in disease LVs 
compared with controls confirmed loss of 
CMs and showed accompanying increased 
ECs, lymphoid and myeloid cells, and altered 
ratios (FB and mural cells compared with CMs) 
with quantitative, genotype-specific differences 
(Fig. 1E). RBM20 and PVneg, respectively, 
showed greatest shifts in the EC:CM (8- and 
10.3-fold), myeloid:CM (9.8- and 14.6-fold), 
and lymphoid:CM (12.4- and 15.2-fold) ratios. 
By contrast, pairwise ratios of all cell types com- 
pared with CMs were modest or unchanged 
in LMNA. 
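
The fold shifts in pairwise ratios quoted above compare a cell type ratio in disease with the same ratio in controls. The arithmetic can be sketched as follows, assuming mean abundances per group; the numbers are hypothetical and the function name `ratio_fold_change` is illustrative. The statistical testing behind the reported FDR values is not reproduced here.

```python
def ratio_fold_change(disease, control, numerator, denominator):
    """Fold change of a numerator:denominator abundance ratio, disease vs control."""
    disease_ratio = disease[numerator] / disease[denominator]
    control_ratio = control[numerator] / control[denominator]
    return disease_ratio / control_ratio


# Hypothetical mean abundances (%) per cell type.
control = {"CM": 40.0, "EC": 10.0, "myeloid": 5.0}
disease = {"CM": 20.0, "EC": 20.0, "myeloid": 10.0}

ec_cm = ratio_fold_change(disease, control, "EC", "CM")
myeloid_cm = ratio_fold_change(disease, control, "myeloid", "CM")
print(f"EC:CM fold change = {ec_cm:.1f}")
print(f"myeloid:CM fold change = {myeloid_cm:.1f}")
```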


Genotypes diversify cell types and states 
Cardiomyocytes 


Disease and control ventricles exhibited the 
previously described cardiomyocyte states, 
vCM1.0 to vCM5 (4), and four new states, 
vCM1.1, vCM1.2, vCM1.3, and vCM3.1 (Fig. 2, 
A and B, and table S6). Across all vCM cell 
states (fig. S5), differentially expressed genes 
(DEGs) in disease connoted increased stress 
(e.g., NPPB) and apoptosis (tables S7 to S12
and fig. S6A). Although these findings implied 
late-stage transcriptional convergence, 20 to 
40% of DEGs were genotype specific (fig. S5). 
Only PVneg reduced MYH6 expression, in-
creasing MYH7:MYH6 ratios (fig. S6B). Con-
versely, only vCMs with PVs down-regulated
SMYD1 (Fig. 2C; fig. S6, C to E; table S13),
a cardioprotective muscle-specific histone
methyltransferase that regulates sarcomere
assembly and mitochondrial energetics (5, 6),
and ADRB1 (β1-adrenergic receptor; fig. S6C),
which is indicative of adrenergic activation
and is therapeutically targeted by β-blockade
medicines (7, 8). Genotype-selective responses 
included up-regulation of FNIP2 (Fig. 2D), in- 
hibiting AMP-activated protein kinase activity
and oxidative metabolism (9-11) in LMNA and
PKP2, and CPEB4, an RNA-binding protein
that activates glycolysis and stimulates fibro-
sis (12) in LMNA, RBM20, and PKP2 (fig. S6F).

Fig. 2. Cardiomyocytes and fibroblast states in control, DCM, and ACM ventricles.
(A) UMAP depicting CM states in all tissues. (B) Control and disease LV and RV
abundance analyses for vCM1.1 (upper panel) and vCM1.2 (lower panel).
(C) Single-molecule RNA fluorescent in situ hybridization exemplifies decreased
SMYD1 (red) expression in CMs (identified by TNNT2 transcripts, cyan) within a
DCM heart with a PV in PLN (phospholamban). Cell boundaries are stained with
WGA (green); nuclei are stained with DAPI (blue). Scale bar, 10 μm. Quantified
expression (spots per CM) and P values from four independent control and disease
LVs with PVs were assessed. (D) Immunohistochemistry validated decreased SMYD1
(red) protein in CMs (identified by troponin T staining, fig. S6E) in TTN LV
section. Cell boundaries are stained with wheat germ agglutinin (WGA; green);
nuclei are stained with 4',6-diamidino-2-phenylindole (DAPI; blue). Scale bar,
10 μm. Quantified protein levels (intensity per CM) and P values were assessed
from five independent control and DCM LVs with PVs. (E) Single-molecule RNA
fluorescent in situ hybridization demonstrated increased expression of FNIP2
(red). CMs, nuclei, and cell boundaries are labeled as in (C). Scale bar, 10 μm.
Quantified expression of FNIP2 (spots per CM and H-score) and P values reflect
analyses of two independent control and PKP2 samples. (F) UMAP depicting FB
states. (G) Hydroxyproline assay (HPA) quantified cardiac collagen content for
each genotype. (H) Control and diseased LV and RV abundance analyses for vFB2
(upper panel) and vFB3 (lower panel). (I) Pathway score of TGFβ activation in LV
vFB2. (J) Single-molecule RNA fluorescent in situ hybridization showing decreased
expression of CCL2 (red) in vFB3 (DCN, cyan) in disease compared with controls.
Cell boundaries are stained with WGA (green); nuclei are stained with DAPI
(blue). Scale bar, 10 μm. Dot plot illustrating the log2 fold change (log2FC)
and significance [-log10(FDR)] of CCL2 expression in LV vFB3 across genotypes.

Proportional differences in vCM states (figs. 
S6G, S7, and S8) varied among genotypes and 
controls and were more prominent in RVs 
than LVs, perhaps reflecting hemodynamic 
differences. PVs generally increased vCM1.1
and vCM1.2, with accompanying decreased
vCM3.1 in LMNA and TTN (Fig. 2B and figs.
S7A and S8, A to D). State-enriched DEGs for
vCM1.1 included MYO18B, which is required
for sarcomere formation (13); XPR1, which
regulates phosphate homeostasis (14); and
IGF1R and ROR1, which are involved in cell
survival (8). vCM1.2 had heightened expres- 
sion of electrophysiologic genes including
PCDH7, which enables intercellular contacts
(15, 16) but when overexpressed impedes
synaptic currents (17); CAMK2D, which mod-
ulates excitation-contraction coupling (8);
and ANK2, which harbors arrhythmogenic
PVs (19, 20). vCM1.3 was modestly increased in
RBM20, associated with the largest numbers
of DEGs in all genotypes, and enriched for
BMP receptors [BMPR1A, BMPR1B (21, 22);
fig. S7, B and D] and GPC5, which regulates
BMP signaling (23). vCM3.1, which was reduced
in LMNA and TTN, expressed early-adaptive
transcriptional regulators of stress responses
(ATF3 and ATF4) and cardioprotection (NR4A3;
fig. S7E) (24). Only PVneg depleted vCM2 and
was enriched for SH3RF2, an anti-apoptosis 
regulator of the JNK pathway (25). Because 
DCM and ACM genes are highly and often 
selectively expressed in CMs, these disease- 
associated vCM states and DEGs defined in- 
trinsic responses to PVs as well as cell-cell 
responses to CM death, and depressed con- 
tractile performance and arrhythmogenicity. 


Fibroblasts 


We identified four previously characterized 
FB states (4) and two new ones, vFB1.1 and 
vFB1.2 (Fig. 2E and table S14). Across all vFB 
states, up-regulated DEGs included genes 
involved in extracellular matrix (ECM) re- 
modeling (figs. S9 to S11 and tables S15 to
S21). LMNA, TTN, and PKP2 increased the fibro-
genic signaling receptor EGFR, and PKP2
also increased AGTR1 (fig. S10A), which en-
ables EGFR transactivation (26, 27). The ex-
pression of profibrotic TGFB2 was universally
increased (fig. S10B). DCM hearts were en-
riched for PCOLCE2, promoting insoluble
collagen formation (28), and LMNA, TTN,
and PVneg down-regulated metalloproteinase
inhibitors TIMP1 and TIMP3 (fig. S10C). Col-
lagen genes showed genotype-specific expres-
sion. COL4A1 and COL4A2 were up-regulated
in LMNA, TTN, and PKP2, whereas COL4A5
and COL24A1 were enriched in PVneg. Thus,
although ECM up-regulation and collagen 
deposition (Fig. 2F and fig. S10, D to F) were 
shared features, genotype and chamber influ- 
enced composition and organization. 

FB state abundance was also altered across 
genotypes (figs. S11A and S12, A to D). vFB1.1
and vFB1.2 expressed canonical vFB1.0 genes,
but vFB1.1 also up-regulated APOD, APOE, and
APOO, which typify lipogenic fibroblasts (29),
and CST3 (fig. S13A), which is involved in
matrix remodeling. vFB1.2-enriched genes
are related to actin filament assembly (DAAM1)
and chondrocyte differentiation (GPC6) (fig.
S13B). vFB2 expressed prominent profibrotic
genes, including TGFβ targets (increased in
DCM LVs) and fibrogenic IL11 (30) (highest in
RBM20 LVs) (Fig. 2H and fig. S14, A to C). vFB2
was increased in TTN and PVneg and modestly
in other DCM LVs (Fig. 2, G and H, and fig. S11).
vFB3, which was diminished in hearts with
PVs, expressed proinflammatory cytokine genes
(CCL2) and genes related to OSM signaling
(IL6ST, OSMR, and oncostatin M-target genes)
(Fig. 2I and fig. S14D). The resultant increased
ratio of vFB2:vFB3 and altered DEGs suggested
dysregulation of fibroblast-to-macrophage in-
teractions that would promote deleterious
ECM remodeling, particularly in TTN LV and
RV (fig. S12, C and D).


Smooth muscle cells and pericytes 


Three previously described pericyte cell states 
(PC1, PC2, and PC3) and three smooth muscle 
cell states (SMC1.1, SMC1.2, and SMC2; Fig. 3A 
and table S22) were identified. DEGs across all 
states (fig. S15 and tables S23 to S28) showed 
up-regulation of the sodium channel SCN3A, 
with unknown vascular functions, and the 
noncoding antisense ADAMTS9-AS2, with 
concordant down-regulation of ADAMTS9, a 
metalloprotease involved in ECM remodeling 
(31, 32) (Fig. 3B and fig. S16A). Overall, the 
genes with dysregulated expression indicated 
that disease evoked signals to synthesize spe- 
cific ECM and integrin components. 

DEGs in PC states included down-regulation 
of two central signaling receptors, NOTCH3 
and PDGFRB, in PKP2 and TTN PC1 (Fig. 3B).
NOTCH3 is required for SMC maturation and
deficiency causes pericyte dysfunction and 
arteriovenous malformations. Notch signal- 
ing regulates PDGFRB, which is necessary for 
angiogenesis and PC recruitment (33-37). 

DEGs in SMCs subdivided the previously 
identified canonical SMC1 (4) into two states: 
SMC1.1, which had higher expression of ACTA2,
MYH11, and TAGLN, and SMC1.2, which strong-
ly expressed ITGA8, required for maintenance
of SMC contractile phenotype (38, 39), and
ATP10A, suggesting vascular stiffness and in-
creasing diastolic dysfunction (Fig. 3B). Meth-
ylation of the ATP10A locus in SMC decreases
with age and atherosclerosis (40). SMC1.2 was
enriched in LMNA, TTN, and PVneg RVs (figs.
S17 and S18). SMC2 expressed high levels of
genes involved in collagen and elastic fiber
formation (ELN and LAMA2) and MYH10,
demarcating dedifferentiated SMCs with se-
cretory properties (figs. S16B and S17C). LMNA
and PKP2 SMC2 up-regulated SLIT3, a stimu-
lator of fibroblast activity (41), and ECM syn-
thesis and collagen formation genes (Fig. 3, B 
and C). Collectively, disease remodeling of MC 
indicated modulation of PDGF and NOTCH 
signaling receptors and synthesis of selective 
ECM and integrin components. 


Endothelial cells 


Characterization of ECs identified seven pre-
viously described cell states (4) (EC1, EC2,
EC5 to EC8, and mesothelial cells) (Fig. 3D,
figs. S19 to S21, and table S29). Shared DEGs
occurred across all and between genotypes
(tables S30 to S35), with clear RV and LV dif-
ferences (fig. S19). Dysregulation of genes
encoding factors involved in EC fate, blood ves-
sel development, and angiogenesis (NOTCH4,
FLT1, FGFR1, and RGCC) (42-44) in disease
LVs indicated that vascular remodeling was 
a common heart failure feature. 

EC composition in DCM RVs and LVs in- 
creased EC5 compared with controls (fig. S21A), 
whereas genotype-specific cell state ratios in 
LVs generally decreased EC1 relative to EC2, 
EC5, and EC6 (Fig. 3F and fig. S21C). Assess- 
ment of selective gene enrichment scores for 
proliferating cells, endothelial-to-mesenchymal 
transformation, and cell death showed no dif- 
ference in disease compared with controls (fig. 
S22 and table S36).

EC7 had the most DEGs and the highest 
proportion of unique DEGs across all geno- 
types. Although initially defined as being 
atrial-enriched (4), EC7 expressed endocardial- 
enriched genes [SMOC1 (45), NPR3 (46), and
POSTN (47)] (fig. S20, B and C), which prompted
a revised annotation to endocardial cells. Up-
regulated genes in EC7 from DCM LVs and
PKP2 RVs (Fig. 3G) encoded secreted pro-
teins involved in myocardial stress adaptation
(NRG1) (48), CM force production (EDN1)
(49, 50), and endocardial expansion during
development or after cardiac injury (BMP6)
(51, 52). BMP6 was selectively up-regulated
in the endocardium of DCM LVs and ACM
RVs (Fig. 3E). Conversely, INHBA, a TGF-β
superfamily member involved in atrioventric-
ular canal development (53), and cell adhe-
sion molecule OPCML were down-regulated.
In PKP2, an unconventional myosin promoting
cell adhesion (MYO10), and POSTN were up-
regulated in both ventricles. These data high- 
lighted the involvement of the endocardium in 
chamber-specific remodeling of cardiomyopathies. 


Myeloid cells 


We classified 14 subclusters of myeloid cells 
comprising macrophages (MPs), monocytes 
(MOs), conventional dendritic cells (cDC1 and 
cDC2), and proliferating myeloid cells (Fig. 4A, 
figs. S23 to S27, and tables S37 to S45). Distinct 
gene sets were unique to each genotype and 
were particularly pronounced in PVneg and 
PKP2 in LVs (fig. S24).

Although disease increased myeloid cells, the 
proportions of proliferating myeloids (Fig. 4, B 
and C, and figs. S26 and S27) were consistently 
lower compared with controls, implying mono- 
cyte infiltration. The tissue-resident MPs 
LYVE1^high MHCII^low and LYVE1^low MHCII^high
(fig. S25C) were the most abundant myeloid
cells (fig. S27, A and B). MP LYVE1^low MHCII^high
were increased in the RV of LMNA, with sim-
ilar trends in other genotypes (fig. S27B). Pro-
portions of MP^OSM were modestly decreased
with down-regulation of OSM in TTN ven-
tricles (fig. S28A), which would attenuate the
MP^OSM-vFB3 signaling axis (4), and mirrored
the decreased OSM pathway activation score
observed in vFB3 (fig. S14B).

Fig. 3. Mural and endothelial cell states in control, DCM, and ACM hearts.
(A) UMAP depicting PC and SMC states in all tissues. (B) Dot plots illustrating
the levels (fold-change; logFC) and significance [-log10(FDR)] of selected DEGs in
LV PC1, SMC1.2, SMC2, and MC (PC and SMC) across genotypes. (C) KEGG
pathway analysis of DEGs in SMC2 among genotypes with one or more enriched
pathways. Color intensity denotes enrichment significance [-log10(FDR)].
(D) UMAP depicting EC states in all tissues. (E) Single-molecule RNA fluorescent
in situ hybridization identified BMP6 (red) expression in the RV endocardium

Prominent antigen-presenting activities were
evident in MO^VCAN, MO^CD16, cDC1, and cDC2,
and also occurred in MP^FOLR2, akin to tumor-
associated MPs (54). Across antigen-presenting
MPs, RBM20 LVs showed the highest pre-
sentation of antigens based on MHCII genes
(fig. S28, B and C) and more abundant cDC2
compared with the other genotypes (fig. S27).
PKP2 LVs up-regulated MP^ISG (fig. S27) with
interferon-stimulated genes (55), perhaps
contributing to inflammatory PKP2 pheno-
types (56).


Lymphoids 


We classified 15 lymphoid cell states, includ- 
ing T and natural killer (NK) cell subsets, 
innate lymphoid cells (ILCs), B cells, plasma 
cells, and proliferating lymphoids (Fig. 4D, 
figs. S29 to S32, and table S46). Our exper- 
imental design did not enrich for immune 
cells, and few (<40) genotype-specific DEGs 
were identified (fig. S29). Proliferating lym-
phoids were rare, and their abundance was
unchanged in disease (figs. S30 and S32, A to
D, and tables S47 to S52).

from a heart with a DSP (desmoplakin) PV compared with control RV. Cell
boundaries are stained with WGA (green); nuclei are stained with DAPI (blue).
Scale bar, 10 μm. (F) Pairwise cell state abundance ratios in DCM LVs relative to
controls. Proportional changes are scaled by color: Red indicates an increase and
blue a decrease in disease versus control. P values are indicated for significant
proportional changes (FDR < 0.05). (G) Dot plots illustrating LV and RV levels
(fold-change; logFC) and significance [-log10(FDR)] of selected DEGs in EC7
across genotypes. Dot size and color are as defined in (B).

The cardiac complexity of CD4+ and CD8+
T cells included naive (CD4T^naive), activated
(CD4T^act), regulatory (CD4T^reg), cytotoxic
(CD8T^cytox), transitional (CD8T^trans), termi-
nal effector (CD8T^te), and effector memory
(CD8T^em) cells (fig. S31B) (57). We detected
increased CD4T^act only within DCM samples
(fig. S32, A to D). However, DEGs indicated
lymphocyte activation (cytokines IFNG, CCL3,
and CCL4 and signaling molecules CBLB, FYN,
and TXNIP), and maturation (cell surface
receptors CD69 and CXCR4), particularly in
PKP2 NK cells, CD4+ T cells, and CD8+ T cells
(Fig. 4E).

Fig. 4. Immune cell states in control, DCM, and ACM hearts. (A) UMAP
depicting myeloid states in all tissues. Unclassified MP1, MP2, and MP3 require
future characterization. Gray boxes enclosing proliferating (prolif) MPs, unclassified
MPs, and cDC1s indicate that these were manually moved toward other states
for ease of representation. The unmodified UMAP is in fig. S23. (B) Myeloid
proliferation had higher abundance (% total myeloids) in controls versus disease.
P values indicate significant differences in abundances. (C) Single-molecule RNA
fluorescent in situ hybridization validated increased expression of TOP2A (red) and
C1QA (white) in controls versus disease. Cell boundaries are stained with WGA
(green); nuclei are stained [...] lymphoid cell states in all tissues. Unclassified
LY1 and LY2 require future characterization. (E) Dot plots showing the level of
fold-change (logFC) and significance [-log10(FDR)] of selected genes in LV
NK^CD16hi, LV and RV CD4T^act, and RV CD8T^trans across genotypes. (F) Dot plots
highlighting gene expression of the Th1, Th2, and Th17 signatures in CD4T^act
cells. Dot size indicates the fraction (%) of expressing cells; color indicates
the mean expression.

CD4+ helper T cells are critical drivers in the
pathogenesis of cardiomyopathy and myo-
carditis (58, 59). In LMNA CD4T^act cells, we
observed up-regulation of TBX21, which is
important for Th1 polarization. Conversely, the 
Th2-polarization transcription factor GATA3 
was down-regulated in PVneg (Fig. 4F). 


Neuronal cells 


Analysis of cardiac NCs was limited by rarity 
of this cell type (Fig. 5A, fig. S33, and tables S53
to S57). Across all NC states, DEGs indicated
increased NFATC2 in LVs with PVs and PKP2
RVs, and genotype-selective enrichment of
LRRK2, an activator of a neurotoxic cascade
(60). Other up-regulated DEGs function in
proteoglycan synthesis for neuronal mye-
lination and axon regeneration (XYLT1 and
HS3ST4), and a complement inhibitor (SUSD4)
that affects neural function and morphology 
(61, 62). Genotype-specific DEGs were highest 
in PKP2. 

We identified previously described NC states 
(4) and three new ones (Fig. 5, B and C, and 
figs. S34 and S35) that were genotype and
chamber selective: NC1.1 in LMNA and TTN
LVs; NC1.2 in TTN and RBM20 LVs and in
LMNA, TTN, and PKP2 RVs. NC1.1 was dis-
tinguished by the highest up-regulation of
NFATC2. NC1.2 up-regulated genes associated
with electrocardiogram intervals (SLC35F1 and
AJAP1; Fig. 5D), IGFBP5, involved in neuronal
apoptosis and autophagy, and the ion channel
and heart rate regulator KCNK1 (63-65). NC1.3
was enriched for the neuromodulator receptor
GALR1 and phosphodiesterases PDE10A and
PDE3B, which participate in neuroprotection
and signaling (66-68). Dysregulated expres-
sion of genes involved in stress responses and
electrophysiology may account for character-
istic life-threatening arrhythmias in DCM and
PVs in PKP2 (1, 2).

Fig. 5. Neuronal and adipocyte cell states in control, DCM, and ACM hearts.
(A) UMAP depicting NC states in all tissues. (B) Dot plot highlighting the top marker
genes for NC states. (C) LV and RV abundance analyses of NC1.1 and NC1.2 in
controls versus disease. P values are indicated for significant proportional changes
(FDR < 0.05). (D) Single-molecule RNA fluorescent in situ hybridization showing
colocalization of AJAP1 (red) and NRXN1 (cyan) in disease as exemplified in a
DCM LV with a PV in PLN (phospholamban) demarcating the NC1.2 state. Cell
boundaries are stained with WGA (green); nuclei are stained with DAPI (blue).
Scale bar, 10 μm. (E) UMAP depicting AD states in all tissues. (F) Dot plot
highlighting the top marker genes for AD states. (G) LV and RV abundance
analyses demonstrated decreased AD1.0 and increased AD1.1 in disease versus
control. (H) Dot plot of DEGs showing expression differences between AD states.
(I) Heatmap of significantly enriched Gene Ontology Biological Processes
terms based on up-regulated genes in disease versus control ADs. Dot size
indicates the fraction (%) of expressing cells; color indicates the mean expression
level. P values indicate significant differences (FDR < 0.05).


Adipocytes 


Similar to NCs, limited adipocytes were cap- 
tured. We identified three adipocyte states 
previously found in controls: canonical AD1.0, 
expressing lipid metabolism genes; stromal 
AD2, expressing ECM genes; and immune 
AD3, expressing OSMR and cytokine-responsive 
genes (Fig. 5, E and F, fig. S36, and tables S60
to S64). Although AD3 was not detected in 
disease, a fourth identified state, AD1.1, was 
almost exclusive to diseased hearts (Fig. 5G 
and fig. S37). Compositional analysis identi- 
fied increased proportions of AD1.1 in PKP2 
LVs and RVs concurrent with decreased LV
proportions of AD1.0. LMNA and RBM20
RVs, unlike LVs, also increased proportions
of AD1.1 (fig. S38, A to D).

DEGs between AD1.1 and AD1.0 revealed 
changes in fatty acid metabolism pathways 
(Fig. 5H). AD1.1 showed down-regulation of 
DGAT2, encoding a triglyceride-forming en- 
zyme (69), and G0S2 and MGLL, encoding
lipolysis regulators (70). Conversely, ABHD5,
a positive lipolysis regulator (70, 71); PDK4, a
kinase promoting the shift from glucose to
fatty acid metabolism; and CIDEA, a regulator
of adipose tissue energy expenditure, were up-
regulated. PKP2 RVs, which typically display
fibrofatty replacement, also showed an enrich-
ment of Gene Ontology biological processes for
apoptosis and cell death (Fig. 5I). These data
implied genotype-specific state transitions or
replacement of canonical adipocytes in DCM
and ACM. 


Differential expression of GWAS genes in 
cardiomyopathies 


Genome-wide association studies (GWAS) have 
identified common genetic variants associated 
with DCM. We selected candidate genes from 
15 previously identified DCM loci (table S67) 
and examined expression across cell types and 
genotypes (fig. S39). Overall, GWAS genes were 
more often DEGs in LV and RV than expected 
by chance in snRNAseq data [LV: odds ratio
(OR) = 7.0, P = 0.0007; RV: OR = 6.1, P = 0.0009,
one-sided Fisher’s exact test]. Multiple genes
showed cell-type-specific expressions, with
the majority highly enriched in CM (ALPK3,
BAG3, FHOD3, FLNC, HSPB7, MLIP, SLC6A6,
SMARCB1, SVIL, and TTN). Among these, we
observed both genotype- and chamber-specific
expression differences. The heat shock pro-
tein HSPB7 was reduced in TTN LVs and in
LMNA and PKP2 RVs. SLC6A6, a taurine and
amino acid transporter with cardioprotec-
tive effects, was increased in both LV and RV
across all genotypes except for PKP2 and PVneg.
CDKN1A, a cell cycle regulator and modula-
tor of apoptosis (72), was increased in LMNA 
CMs from LVs but not RVs (72). 
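
The enrichment of GWAS genes among DEGs reported above was assessed with a one-sided Fisher's exact test. As a minimal sketch of how such an odds ratio and P value arise from a 2x2 table, the following assumes a table of GWAS versus non-GWAS genes by DEG status. The counts are hypothetical and do not reproduce the reported values; `fisher_one_sided` is an illustrative helper, not the authors' code.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """Odds ratio and one-sided (enrichment) P value for the table [[a, b], [c, d]].

    a: GWAS genes that are DEGs       b: GWAS genes that are not DEGs
    c: other genes that are DEGs      d: other genes that are not DEGs
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    denominator = comb(total, col1)
    # Sum hypergeometric probabilities of tables with top-left cell >= a.
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(total - row1, col1 - k) for k in range(a, upper + 1)
    )
    p_value = tail / denominator
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, p_value


# Hypothetical counts for illustration only.
odds_ratio, p_value = fisher_one_sided(6, 9, 100, 900)
print(f"OR = {odds_ratio:.1f}, one-sided P = {p_value:.4f}")
```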

GWAS genes that were more broadly ex- 
pressed across cell types included MTSS1, en-
coding a putative actin-cytoskeletal interactor. 
MTSS1 showed highest and unchanged ex- 
pression in myeloids and was widely increased 
in mural cells as well as in FBs in LMNA, 
TTN, and PKP2 (fig. S39), suggesting influences 
beyond direct effects on CM function (73, 74). 


SLC39A8, a lowly expressed cardiac solute 
carrier, was unchanged in CMs but increased 
in LMNA and TTN LV ECs. We suggest that 
cell-specific expression changes of GWAS genes 
may improve interpretation of their biologic 
effects. 


Predicted and altered cell-cell interactions 
across genotypes 


By examining the expression of genes encod- 
ing for receptors and ligands, we inferred 
intercellular signaling and communication 
(75). We initially quantified the probability of 
cell-cell interactions and compared signaling 
between cell states, and then aggregated in- 
formation to produce cell-specific and across- 
all-cell-types data for each genotype relative to 
controls. This sequential approach accounted 
for differential abundances of cell states. 
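
To illustrate the idea of scoring a ligand-receptor interaction and comparing it between disease and control, the following is a deliberately simplified sketch. It assumes the score is the product of mean ligand expression in the sending cells and mean receptor expression in the receiving cells, which is a simplification and not the probability model of the tool cited in (75). All values and function names are hypothetical.

```python
import math


def interaction_score(ligand_mean, receptor_mean):
    """Simplified sender-to-receiver score: mean ligand x mean receptor expression."""
    return ligand_mean * receptor_mean


def log2_fold_change(disease_score, control_score, eps=1e-9):
    """log2 ratio of disease to control scores; eps (an assumption) avoids division by zero."""
    return math.log2((disease_score + eps) / (control_score + eps))


# Hypothetical mean expression values for one ligand-receptor pair.
control_score = interaction_score(ligand_mean=0.5, receptor_mean=1.0)
disease_score = interaction_score(ligand_mean=1.0, receptor_mean=2.0)
change = log2_fold_change(disease_score, control_score)
print(f"log2FC of interaction strength = {change:.2f}")
```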


We detected aberrant intercellular signaling 
across disease (Fig. 6A and fig. S40), including 
up-regulation of the BMP, FN1, collagen, EGF,
IGF, and TGFβ pathways that promote fibrosis.
Signaling dependent on VEGF, NOTCH, and 
ANGPT was also increased in disease, imply- 
ing vascular remodeling. Genotype-selective 
increases in intercellular signaling pathways 
were also identified in LVs (Fig. 6B and fig. S40, 
A and B): EDN in LMNA, the proinflammatory
IL6 in TTN, BAFF/LIGHT (denoting TNF sig- 
naling) in RBM20, pro-inflammatory CCL and 
TNF in PKP2, and the immune modulator 
BTLA in PVneg. Some of these intercellular 
signaling pathways were similarly dysregulated 
in RV, but chamber-specific changes were also 
observed (fig. S40, C and D, and fig. S41). 

Fig. 6. Altered cell-cell interactions and recognition of genotype-specific
transcriptional responses. Heatmaps depict shared (A) and unique (B) signaling
pathways in LVs, with significantly different expression in genotypes compared
with controls. Signaling pathways are defined in the CellChat database (75).
Changes in interaction strength (log2FC; table S68) are scaled by color intensity
(red, increased; blue, decreased). Asterisks denote statistical significance
(adjusted P value < 0.05); dashes indicate that expression was not detected in
control or disease. (C) Circle plots of significant (adjusted P value < 0.05)
cell-cell communication depicting differentially regulated IGF, BMP, and NRG
pathways and interactions in disease LVs. The line thickness denotes interaction
strength of signals from sending and receiving cells; color is scaled from
zero to maximum in disease versus controls (orange, increased; blue,
decreased). Arrows indicate directionality. (D) Top: Genotype prediction
probability from GAT per cell type. Bottom: Stacked bar plots representing
the likelihood (% aggregated probability) of individual patient genotypes
by GAT prediction. Most established genotypes were predicted with high
probability, with lower prediction probability only in H10 (PKP2), H20 (RBM20),
H22 (RBM20), and H33 (RBM20).

We also identified genotype-specific differ-
ences in the cells sending and receiving signals
in disease pathways (Fig. 6C). For example, IGF
signaling (fig. S42), which is crucial in cell
growth and CM hypertrophy, showed increased
FB autocrine (highest in PKP2) and paracrine
FB to CM signaling, paralleling findings in
experimental models (76, 77). In addition,
IGF signaling from myeloid cells to CMs only
occurred in TTN, PKP2, and PVneg, an inter-
action that might promote muscle repair (78).

The source of BMP signaling changed in a 
genotype-specific manner (Fig. 6C and fig. S43). 
BMP signaling from MC to CM was increased 
in TTN LVs and RVs, but was down-regulated 
in PVneg and LMNA LVs and other RVs (figs. 
S41 and S43). BMP signaling in ECs originated 
solely from EC7, likely depending on BMP6 
up-regulation (Fig. 3, E and G). EDN signaling 
from EC7 to CM and MC was highly genotype 
selective, occurring in LMNA LV and PKP2 RV 
(fig. S44). 

NRG signaling (comprising NRG1-3 and
ERBB receptors) showed multifaceted changes.
Disease LVs markedly attenuated autocrine
NRG signaling in CMs, whereas EC and CM
signaling was up-regulated (highest in LMNA)
in all genotypes except RBM20 (Fig. 6C and
fig. S45). Additionally, NRG3-ERBB4 inter-
actions identified in controls shifted in dis-
ease to NRG1-ERBB4 and NRG1-ERBB3 in a
genotype-specific manner, consistent with
changes in NRG1/3 expression in EC7 (Fig. 3G
and fig. S46) (79). This predicted NRG/ERBB 
shift may provide compensatory responses to 
adverse remodeling in cardiomyopathies (48). 


Graph attention networks recognized 
genotype-specific expression patterns 


We applied machine-learning approaches to 
snRNAseq data to further advance the recog- 
nition of cell- and genotype-specific transcrip- 
tional patterns. Cell-specific neighborhood 
graphs showed more connectivities among 
single-nuclei transcriptomes from PVneg 
hearts and hearts with PVs in the same gene 
compared with PVs from different genes (fig. 
S47). Subsequently, we generated a graph 
attention network (GAT) for multinomial 
classification of genotypes trained on four 
major informative LV cell types: CMs, FBs, 
ECs, and myeloids (fig. S47, A and B). The GAT 
predicted LMNA, TTN, RBM20, PKP2, and 
PVneg genotypes with high accuracy. Among 
LV samples, the genotype prediction accu- 
racy differed by cell type: CMs, 0.93; FBs, 0.92; 
myeloids, 0.85; and ECs, 0.79 (Fig. 6D, fig. S48, 
and tables S69 to S70; corresponding RV data 
are shown in fig. S47C). Aggregation of geno- 
type predictions obtained from these four LV 
cell types strengthened the correct prediction 
of genotypes, resulting in a high confidence 
model (Fig. 6D). Three (H10, H22, and H33) of 
the four lower prediction probabilities occurred 
in samples with both a primary and secondary 
PV, assigned as such by prior genotype and
clinical phenotype review (table S1). More-
over, because this machine-learning model 
independently confirmed a genotype- and cell- 
type-specific transcriptional signature, we con- 
cluded that these snRNAseq datasets accurately 
described the molecular responses to PVs and 
unexplained causes of DCM and ACM. 
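
The aggregation step described above combines genotype predictions from several cell types into one call per patient. A minimal sketch, assuming each cell type yields a probability per genotype and that aggregation is a simple average (the averaging rule, the probabilities, and the function name `aggregate_predictions` are assumptions for illustration, not the published model):

```python
def aggregate_predictions(per_cell_type):
    """Average per-cell-type genotype probabilities; return (best_genotype, averaged)."""
    genotypes = next(iter(per_cell_type.values())).keys()
    n_cell_types = len(per_cell_type)
    averaged = {
        genotype: sum(probs[genotype] for probs in per_cell_type.values()) / n_cell_types
        for genotype in genotypes
    }
    best = max(averaged, key=averaged.get)
    return best, averaged


# Hypothetical per-cell-type prediction probabilities for one patient.
patient = {
    "CM": {"LMNA": 0.7, "TTN": 0.2, "PVneg": 0.1},
    "FB": {"LMNA": 0.6, "TTN": 0.3, "PVneg": 0.1},
    "EC": {"LMNA": 0.3, "TTN": 0.5, "PVneg": 0.2},
    "myeloid": {"LMNA": 0.5, "TTN": 0.3, "PVneg": 0.2},
}
best, averaged = aggregate_predictions(patient)
print(best, {g: round(p, 3) for g, p in averaged.items()})
```

In this toy example the EC-based prediction alone would favor a different genotype, whereas the aggregate favors the genotype supported by the other three cell types.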


Discussion 


Our analyses of snRNAseq of LV and RV sam- 
ples illuminated the cell types and states, 
molecular signals, and predicted intercellular 
communications that characterized DCM and 
ACM. Compared with control hearts, we iden- 
tified differences at multiple levels, including 
changes in the proportions of cell types and 
states, additional cell states, and differential 
gene expression, substantially expanding earlier 
insights achieved by bulk tissue analyses. Across 
all genotypes, diseased hearts demonstrated 
some common dissimilarities from control 
hearts, often with graded differences between 
LVs and RVs. Despite studying hearts from 
patients with advanced disease who received 
diverse therapies, congruent transcriptional 
signatures emerged for different PVs within 
the same gene and varied between genotypes. 

The differences between genotype groups 
and controls reflected differences in mean ex- 
pression and not differences in variance. This 
was true for each genotype group, including 
PVneg. Transcriptional signatures were com- 
plex, diversified both by the proportions of 
canonical and stressed cell states and by dif- 
ferentially expressed genes within the same 
states. Although interrogation of these datasets 
provides ongoing opportunities for discovery, 
our findings provided substantial evidence that 
genotype influenced pathological remodeling 
of the heart. These results upend a prevalent 
dogma that heart failure results from a com- 
mon final pathway and can guide the future 
development of therapies with selective tar- 
gets to enhance personalized medicine. 

Despite anatomical and histopathological 
differences between DCM and ACM, we iden- 
tified shared changes in the cellular compo- 
sition of ventricular tissues, albeit with DCM 
LV features largely mirrored in ACM RVs. 
Cardiomyopathies were depleted in CMs, 
whereas EC and immune cell populations 
were increased. FBs did not expand, but in- 
creased states that augmented ECM gene ex- 
pression and collagen deposition. Based on 
these changes and cytokine profiles (TGFB 
activation, increased IGF signaling, decreased 
CCL2 expression; Fig. 2, H and I, and 6C), we 
predicted cell-cell interactions and key mol- 
ecules that are appropriate for mechanistic 
studies to causally link differential expression 
with adverse cardiac remodeling. 

Disease CMs exhibited loss of the canonical 
state vCM1.0 and the emergence of genotype- 
enriched cell states with DEGs. Many of these 
responses are associated with stress-induced 
contractile, metabolic, and electrophysiologic 
properties that are prominent clinical mani- 
festations of some genotypes. For example, 
attenuated expression of SMYD1, a critical 
organizer of sarcomere structure and epige- 
netic and metabolic remodeling, occurred only 
in CMs with PVs and was unaltered in PVneg 
samples. With its global impact on myocardial 
function, dysregulation of SMYD1 might 
contribute to earlier presentation and poorer 
outcomes of cardiomyopathy patients with versus 
without PVs (80). LMNA CMs had the greatest 
expansion of vCM1.2, enriched for genes 
encoding Ca²⁺ regulators and molecules with 
electrophysiologic functions, whereas RBM20 
and LMNA genotypes had the highest expression 
of MYL4, a sarcomere protein associated 
with atrial fibrillation. These data suggested 
molecular mechanisms whereby particular 
genotypes convey increased risks for arrhyth- 
mias and sudden cardiac death in patients. 

Shifts in FB states explained the paradoxi- 
cally increased fibrosis in cardiomyopathies, 
without expansion of overall FB populations. 
DCM LVs and ACM RVs showed increased 
proportions of vFB2, enriched for genes that 
modulate ECM composition, turnover, stiffness, 
and fibrotic scarring (81), and reciprocally 
decreased proportions of vFB3 that express 
transcripts suppressing fibrosis. Fibrogenic 
genes within vFB2 were highly expressed in 
RBM20, a genotype within our cohort with 
the poorest ventricular function and youngest 
age for heart failure diagnosis and cardiac 
transplantation. Strategies to manipulate pro- 
teins encoded by these genes may attenuate 
the prominent myocardial fibrosis that char- 
acterizes cardiomyopathies. 

Unexpectedly, mural cells (PCs and SMCs) 
showed no increase compared with ECs across 
cardiomyopathies. DEG analysis suggested 
that these cells promoted vascular remodeling 
and dysfunction. PCs diminished PDGFRB 
expression and showed aberrant NOTCH sig- 
naling in vascular beds, whereas genotype- 
selective DEGs in SMCs up-regulated contractile 
genes, augmenting fiber formation. Together, 
these molecular signals may contribute to 
the microvascular dysfunction that occurs in 
cardiomyopathy patients (82) and adversely 
influences ventricular performance. 

Among ECs, EC7 demarcated the endocardium 
and had the most DEGs in disease compared 
with control. Little is known about this 
myocardial layer in heart failure pathogenesis. 
The endocardium forms through dynamic reg- 
ulation of NOTCH, neuregulin (83), and BMP 
(84) signals. Pediatric heart diseases can 
exhibit pathological expansion of the endo- 
cardium (denoted endocardial fibroelastosis), 
which diminishes cardiac performance and is 
associated with progression to heart failure (84). 
Our data suggested similar dysregulation of 
these molecular pathways in adult-onset 
cardiomyopathies with increased BMP6 and 
NRG1 in EC7 and decreased INHBA (53). These 
signals may cause pathological changes in 
endocardium in DCM and ACM hearts and 
contribute to myocardial dysfunction. 

Diseased hearts had increased myeloid cells. 
Expansion of immune cell populations might 
arise from recruitment of circulating cells or 
proliferation of resident cells. In our samples, 
proliferating myeloid cells decreased, as was 
previously observed in other DCM studies 
(85). Myeloid recruitment through the CCR2/ 
CCL2 axis, which is primarily mediated by 
vFB3, was decreased in all samples, with par- 
ticularly dysregulated fibroblast-to-macrophage 
interactions in TTN. Although distinguishing 
between proliferation versus recruitment late in 
disease was problematic, previous studies in- 
dicated that peak cytokine expression precedes 
the emergence of heart failure (86, 87). Analyses 
of earlier time points during disease progression 
will be important to discern these key signals. 

We used machine-learning strategies to con- 
firm and expand the conclusion that genotype- 
specific signals meaningfully contribute to 
disease pathogenesis. Using GAT to classify 
patients’ genotypes from each cell type, we 
found that CMs, FBs, ECs, and myeloid cells 
provided the highest and most discrimina- 
tory information. Harnessing LV and RV data 
from these cell types, we independently pre- 
dicted the established genotype of each pa- 
tient with high accuracy. Moreover, among 
the four samples with the lowest genotype- 
predictive probability, three samples carried 
two PVs, indicating that the model detected 
subtle transcriptional differences with addi- 
tional influences. 

Although expanding machine-learning mod- 
els to much larger datasets will undoubtedly 
improve accuracy, these early analyses sup- 
ported the conclusion that PVs in different 
genes evoked cell-type- and state-specific re- 
sponses that altered intercellular communica- 
tions and promoted distinct disease pathways. 
We recognize that pathways may converge, 
but even in advanced disease, our data indi- 
cate that genotypes promoted specific tran- 
scriptional signals that likely contributed to 
distinct as well as common manifestations of 
genetic cardiomyopathies. 

Future studies are needed to comprehen- 
sively define the molecular pathophysiology of 
cardiomyopathies and heart failure, including 
assessments of age, sex, and ancestral in- 
fluences; other DCM and ACM genotypes; 
additional cardiac regions; and longitudinal 
analyses to identify initiating and secondary 
processes. We also expect that the deployment 
of strategies that upsample conduction system 
and other rare cell types and the incorporation 
of techniques to characterize the epigenome, 
proteome, and spatial relationships between 
cell types, states, and gene expression will also 
be highly informative. To promote these ini- 
tiatives, we freely provide all datasets and 
an interactive platform 
(https://cellxgene.cziscience.com/collections/e75342a8-0f3b-4ec5-8ee1-245a23e0f7cb) 
with cell type and 
state annotations. We expect that these re- 
sources will advance mechanistic studies to 
improve treatment of cardiomyopathies and 
enable heart failure prevention strategies. 


Methods summary 


Detailed information on human subject studies, 
experimental methods, data access, codes, 
algorithms, and computational programs used 
in this manuscript is provided in the supple- 
mentary materials (88). 

Human studies were performed using pro- 
tocols that were reviewed and approved by 
the ethics boards of participating institutions. 
DCM and ACM ventricular samples were col- 
lected from genotyped patients undergoing 
mechanical support (n = 15) or heart 
transplantation (n = 31) and from deceased donors 
with nonfailing hearts as described previously. 
Nuclei from full-thickness LV and RV regions 
were isolated and processed for snRNAseq as 
described previously. Data were mapped to 
the human genome (GRCh38), processed to 
remove doublets and to identify nuclei that 
met high quality standards, and harmonized 
to remove batch effects. Manifolds were con- 
structed using UMAPs for all and individual 
cell types. 

Differential abundance analyses of cell types 
and states were performed using centered log 
ratio transformation including a linear model. 
Differential gene expression between disease 
and control tissues was deduced using a 
pseudobulk approach and edgeR. Comparative 
analyses assessed cell type and cell state 
abundances and differential gene expression 
between disease samples and controls and be- 
tween genotypes. Selected genes with differ- 
ential expression were validated using single- 
molecule fluorescent in situ hybridization or 
quantitative immunohistochemistry. 
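
The two transformations named above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the pseudocount value, the function names, and the toy inputs are assumptions, and the downstream linear model and edgeR testing are not shown.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's cell-type counts.

    ASSUMPTION: the pseudocount (here 0.5) guards against log(0); the value
    used in the study is not given in this summary.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def pseudobulk(cells):
    """Sum raw gene counts over nuclei sharing the same (sample, cell type).

    cells is an iterable of (sample, cell_type, {gene: count}).
    """
    totals = {}
    for sample, cell_type, gene_counts in cells:
        bucket = totals.setdefault((sample, cell_type), {})
        for gene, n in gene_counts.items():
            bucket[gene] = bucket.get(gene, 0) + n
    return totals


# Toy inputs (illustrative only).
example_clr = clr([120, 60, 30], pseudocount=0.0)
example_bulk = pseudobulk([
    ("S1", "CM", {"TTN": 3, "MYL4": 1}),
    ("S1", "CM", {"TTN": 2}),
    ("S1", "FB", {"COL1A1": 4}),
])
```

The transformed values sum to zero within a sample, which is what makes compositional proportions suitable for a linear model; the pseudobulk step collapses nuclei to one count vector per sample and cell type so that samples, not nuclei, are the unit of replication.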

We investigated cell-cell communication 
using CellChat. The expression of genes pre- 
viously identified through GWAS of DCM was 
assessed in diseased and control tissues. We 
interrogated transcriptional datasets with 
machine-learning tools to generate cell type 
features that distinguished PVs within the 
same gene from PVs within different genes 
and used these data to generate a GAT. The 
accuracy for GAT analyses of randomly selected 
patient data to assign the correct, clinically 
assigned genotype was assessed. 


The continuum of Drosophila embryonic 
development at single-cell resolution 


INTRODUCTION: Single-cell technologies are a 
powerful means of studying metazoan devel- 
opment, enabling comprehensive surveys of 
cellular diversity at profiled time points and 
shedding light on the dynamics of regulatory 
element activity and gene expression changes 
during the in vivo emergence of each cell type. 
However, nearly all such whole-embryo atlases 
of embryogenesis remain limited by sampling 
density—i.e., the number of discrete time points 
at which individual embryos are harvested and 
cells or nuclei are collected. Given the rapidity 
with which molecular and cellular programs 
unfold, this limits the resolution at which reg- 
ulatory transitions can be characterized. For 
example, in the mouse, there are typically 6 to 
24 hours between sampled embryonic time 
points, gaps within which massive molecular 
and morphological changes take place. 


Characterizing the continuum of Drosophila embryogenesis. We collected staged Drosophila embryos from 
overlapping time windows across the first 20 hours of embryogenesis. Then we extracted nuclei and performed 
single-cell RNA sequencing (RNA-seq) and assay for transposase-accessible chromatin using sequencing (ATAC- 
seq) profiling using combinatorial indexing (sci-RNA-seq and sci-ATAC-seq) to comprehensively map expressed 
genes and putatively active regulatory elements. We applied machine learning to infer a continuum of nuclear ages 
that is synchronized across unfolding lineages in absolute time. The continuous nuclear age predictions were used 
to annotate and then link cellular states at nonoverlapping 2-hour intervals, as well as to explore transcriptional 
regulatory dynamics across major cell lineages of embryonic development at fine-scale temporal resolution. 


RATIONALE: To construct an ungapped repre- 
sentation of embryogenesis in vivo, we would 
ideally sample embryos continuously. Although 
this is not practical for most model organisms, it is 
potentially possible in Drosophila melanogaster, 
where collections of timed and yet somewhat 
asynchronous embryos are easy to obtain, such 
that, at least in principle, one can achieve ar- 
bitrarily high temporal resolution. Drosophila 
could therefore serve as a test case to develop a 
framework for the inference of continuous reg- 
ulatory and cellular trajectories of in vivo em- 
bryogenesis. Because Drosophila is a preeminent 
model organism that has yielded many advances 
in the biological and biomedical sciences, obtain- 
ing a single-cell atlas of Drosophila embryo- 
genesis is also an important goal in itself. This 
includes its embryonic development, where the 
use of this model in conjunction with powerful 
genetic tools has transformed our understand- 
ing of the mechanisms by which developmental 
complexity is achieved, in addition to uncover- 
ing many general principles of both genetic and 
epigenetic gene regulation. 


RESULTS: We profiled chromatin accessibility 
in almost 1 million nuclei and gene expression 
in half a million nuclei from eleven overlap- 
ping windows spanning the entirety of em- 
bryogenesis (0 to 20 hours). To exploit the 
developmental asynchronicity of embryos from 
each collection window, we applied deep neural 
network-based predictive modeling to more 
precisely predict the developmental age of each 
nucleus within the dataset, resulting in contin- 
uous, multimodal views of molecular and cellu- 
lar transitions in absolute time. With these 
data, the dynamics of enhancer usage and gene 
expression can be explored within and across 
lineages at the scale of minutes, including for 
precise transitions like zygotic genome activation. 


CONCLUSION: This Drosophila embryonic atlas 
broadly informs the orchestration of cellular 
states during the most dynamic stages in the 
life cycle of metazoan organisms. The in- 
clusion of predicted nuclear ages will fa- 
cilitate the exploration of the precise time 
points at which genes become active in dis- 
tinct tissues as well as how chromatin is 
remodeled across time. 


Drosophila melanogaster is a powerful, long-standing model for metazoan development and gene 
regulation. We profiled chromatin accessibility in almost 1 million and gene expression in half a million 
nuclei from overlapping windows spanning the entirety of embryogenesis. Leveraging developmental 
asynchronicity within embryo collections, we applied deep neural networks to infer the age of each 
nucleus, resulting in continuous, multimodal views of molecular and cellular transitions in absolute 
time. We identify cell lineages; infer their developmental relationships; and link dynamic changes in 
enhancer usage, transcription factor (TF) expression, and the accessibility of TFs’ cognate motifs. With 
these data, the dynamics of enhancer usage and gene expression can be explored within and across 
lineages at the scale of minutes, including for precise transitions like zygotic genome activation. 


Single-cell technologies are a powerful 
means of studying metazoan development, 
shedding light on the emergence 
of cellular diversity and the dynamics of 
gene regulation. However, nearly all such 
atlases of embryogenesis are limited in terms of 
the number of discrete time points and cells 
sampled per time point. Given the rapidity with 
which molecular and cellular programs unfold, 
this limits the resolution at which regulatory 
transitions can be characterized. 

To more completely represent development, 
embryos would ideally be sampled continu- 
ously. Although impractical for most model 
organisms, it is feasible in Drosophila, where 
collections of timed and yet somewhat asyn- 
chronous embryos are easy to obtain, such 
that, in principle, one can achieve arbitrarily 
high temporal resolution. This sharply con- 
trasts with mice, for which there are typically 
6 to 24 hours between sampled time points, 
gaps within which massive molecular and 
morphological changes take place (1-4). Although 
sampling gaps can be computationally filled 
through the continuum of cell states repre- 
sented in single embryos (4, 5), the asynchro- 
nous ages of Drosophila embryos within staged 
collections present an opportunity for more 
bona fide continuity—e.g., with seconds or 
minutes separating the developmental ages of 
consecutive embryos rather than hours or days. 
Moreover, because Drosophila melanogaster is 
a preeminent model organism that has yielded 
many discoveries and general principles of 
metazoan development and gene regulation, 
obtaining a single-cell atlas of Drosophila em- 
bryogenesis is an important goal in itself. 


Results 


We set out to measure chromatin accessibility 
and gene expression from individual nuclei 
spanning a continuum of D. melanogaster 
embryogenesis. Staged embryos were collected 
in 11 overlapping time windows, collectively 0 
to 20 hours, covering the entirety of embryo- 
genesis at 25°C. Overlapping 2-hour collections 
were used to capture the rapid transitions 
during early stages, followed by overlapping 
4-hour collections from 3 hours onward (Fig. 1A). 
From each collection, samples were split and 
separately processed for assay for transposase- 
accessible chromatin using sequencing (ATAC- 
seq) or RNA sequencing (RNA-seq). Although 
we hereafter refer to cells, all data were gen- 
erated from nuclei. Single-cell profiling was 
conducted using three-level combinatorial in- 
dexing (sci-ATAC-seq3 and sci-RNA-seq3) with 
minor modifications (1, 6). 

Sci-ATAC-seq3 and sci-RNA-seq3 libraries 
were sequenced to generate 30 billion and 
6.8 billion raw reads, respectively (fig. S1). 
After deduplication and application of quality 
filters, we obtained chromatin accessibility 
profiles for 976,460 cells [single-cell ATAC (scATAC): 
median 5206 nonduplicate reads per cell] and 
gene expression profiles for 547,805 cells 
[single-cell RNA (scRNA): median 399 unique 
molecular identifiers (UMIs) and 274 genes 
detected per cell]. Although our scRNA data 
have fewer UMIs per nucleus than previously 
obtained from Drosophila embryos (7), we pro- 
filed many more nuclei spanning many more 
stages of embryogenesis and complemented 
this with scATAC with a high number of unique 
reads per nucleus. Given the small size of the 
Drosophila embryo, such deep “shotgun cel- 
lular coverage” should effectively sample all 
tissue types during embryogenesis. The data 
did not appear to be confounded by batch ef- 
fects (fig. S2, A to G). 

For both data modalities, integrating and 
visualizing single-cell profiles across all time 
points resulted in branching structures going 
from early to late stages, consistent with in- 
creasing complexity (Fig. 1, B and C). From the 
scATAC data, we identified 110,185 regions ex- 
hibiting accessibility at some point during 
embryogenesis. Collectively, these candidate 
regulatory elements cover 30.4 Mb (22%) of 
Drosophila euchromatin (dm6) and include 
85% of known embryonic enhancers, based on 
overlap with nearly 5000 curated enhancers 
confirmed in transgenic embryos (Fig. 1D) 
(8-10). This, together with the high coverage 
of both bulk deoxyribonuclease (DNase) I 
hypersensitive site (DHS) peaks (87%) and 
scATAC-derived peaks (98%) from 2 to 12 hours 
(11, 12), supports the comprehensiveness of 
this compendium. Similar results were ob- 
tained computing overlaps on a per-base rather 
than per-element basis (fig. S2H). We addi- 
tionally uncovered more than 40,000 distal 
accessible regions not identified in these pre- 
vious studies (Fig. 1D) that are enriched for 
enhancer-associated histone marks, suggesting 
that they are previously uncharacterized 
developmental enhancers (fig. S2I). The 
compendium also recovered 94% of 8008 extensively 
validated mesodermal cis-regulatory modules (13) 
and 96% of nearly 1 million chromatin 
immunoprecipitation (ChIP)-defined binding sites across 
233 transcription factors (TFs) (14) (fig. S2J). 
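
The recovery percentages quoted above are per-element overlaps between a reference set and the accessible regions. A minimal sketch of that calculation follows; the function name, the half-open interval convention, and the toy coordinates are assumptions, and a real analysis would typically use an interval library rather than this quadratic scan.

```python
def fraction_recovered(reference, peaks):
    """Fraction of reference intervals overlapped by at least one peak.

    Intervals are (chrom, start, end) and treated as half-open, as in BED
    files (an assumption of this sketch).
    """
    peaks_by_chrom = {}
    for chrom, start, end in peaks:
        peaks_by_chrom.setdefault(chrom, []).append((start, end))

    hits = 0
    for chrom, start, end in reference:
        candidates = peaks_by_chrom.get(chrom, [])
        # Two half-open intervals overlap when each starts before the other ends.
        if any(p_start < end and start < p_end for p_start, p_end in candidates):
            hits += 1
    return hits / len(reference) if reference else 0.0


# Toy coordinates (illustrative only).
curated = [("chr2L", 100, 200), ("chr2L", 500, 600),
           ("chr3R", 50, 150), ("chrX", 10, 20)]
accessible = [("chr2L", 150, 250), ("chr2L", 590, 700),
              ("chr3R", 150, 300), ("chrX", 0, 15)]
recovered = fraction_recovered(curated, accessible)
```

The chr3R pair only touches at position 150, so it does not count as an overlap under the half-open convention. A per-base variant would instead sum overlapping base pairs and divide by the total reference length.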

In exploring these data, we identified thou- 
sands of genomic regions and transcripts whose 
accessibility and expression levels, respective- 
ly, were strongly correlated with the progres- 
sion of developmental time (Fig. 1, E and F). 
Notably, not all of these correlations were cell 
type specific (fig. S3). The presence of such 
time-dependent elements and transcripts sug- 
gests that a dynamic process is unfolding 
across development, at least some aspects of 
which are cell type specific, whereas other 
aspects appear general to germ layers or the 
entire organism. We reasoned that we could 
leverage these correlations to build a model 
to predict absolute developmental age of any 
given nucleus with greater temporal resolution 
than our 2- to 4-hour collection windows. 


Fig. 1. Single-cell profiling of chromatin accessibility and gene expression 
throughout Drosophila embryogenesis. (A) Eleven overlapping collection windows 
that collectively cover embryogenesis. (B) UMAP visualization of cell-x-peak matrix 
of evenly time-subsampled sci-ATAC-seq nuclei that passed QC. (C) Same as 
(B), but for sci-RNA-seq. (D) Heatmap showing proportion of our scATAC peaks 
overlapping ~5000 curated enhancers (8-10), bulk DHS peaks from 2 to 12 hours 
(11), scATAC peaks from 2 to 12 hours (12), or annotated TSSs (49). (E) Chromatin 
accessibility, normalized by counts per million reads, across representative regions 
exhibiting time dependence across 11 collection windows. (F) Gene expression of 
representative genes exhibiting time dependence across 11 collection windows. Read 
counts were normalized, multiplied by a scale-factor, log-transformed after the 
addition of a pseudocount, and averaged across all cells within each window. 


Predicting the absolute age of individual nuclei 


In these data, the precise developmental age 
of each sampled nucleus is unknown; only 
the 2- to 4-hour collection window from which 
it derived is known. To estimate the age of each 
nucleus with greater precision, we fit a series of 
models using either the scATAC or scRNA 
data as input and predicting the center hour 
of the collection window from which any given 
nucleus was obtained (Fig. 2A). Specifically, 
we split a subset of each dataset, evenly 
subsampled with respect to time, into 11 partitions, 
10 of which were used as training data to fit 
either a lasso linear (LL) model or a neural 
network (NN)-based model with 10-fold 
cross-validation across various test parameters. 
After selecting the highest performing 
parameterization, the NN-based models markedly 
outperformed LL models for both data types 
in predicting the developmental age of nuclei 
within the held-out 11th partition [for NN 
versus LL, mean squared error (MSE): ATAC = 
5.26 versus 8.8, RNA = 2.54 versus 4.72; 
proportion correct: ATAC = 0.67 versus 0.53, 
RNA = 0.87 versus 0.65]. We therefore moved 
forward with NN-based nuclear age predictions 
for the remainder of this study (Fig. 2B 
and fig. S4). Notably, the scRNA-based model 
was slightly more accurate than the scATAC- 
based model, likely leading to slightly older 
age predictions during early collection windows 
and slightly younger age predictions 
during late collection windows for scATAC 
ages compared with scRNA ages. 


Fig. 2. Inferring developmental age from cellular state. (A) We fit a NN-based 
model that uses either gene expression or chromatin accessibility to predict the 
center hour of the time window from which each nucleus was sampled. The 
inferred nuclear ages make up a continuum. (B) NN model-predicted 
developmental ages (y axis) of test set nuclei, equally sampled from discrete 
time windows (x axis) and not included in model training. (C) NN model- 
predicted developmental ages (y axis) of bulk RNA-seq samples (15) collected 
from 2-hour windows (x axis). (D) NN model-predicted developmental ages 
(y axis) of bulk DNase-seq samples from either whole-embryo or purified tissues 
collected from 2-hour windows (x axis). (E) Expression of zygotic (left), maternal 
(top right), or silent (bottom right) genes in nuclei from predicted age windows 
in 5-min increments across 0 to 2 hours of development. (F) Accessibility of 
most variable scATAC peaks from predicted age windows in 1-min increments 
across 0 to 2 hours of development. Labels indicate regions illustrated in (G). 
(G) Examples of cis-regulatory regions known to exhibit dynamic accessibility in 
early embryos (17). (H and I) Examples of time-associated genes, with 
expression values averaged across all nuclei from indicated collection windows 
(H) or from predicted age windows in 10-min increments (I). 
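
The held-out comparison of NN and LL models rests on two summary metrics, MSE and proportion correct. The sketch below shows how such metrics could be computed for nuclei whose only label is a collection window. It assumes that "proportion correct" means the predicted age falls inside the window of origin, which this section does not define, and the windows and predictions are hypothetical.

```python
def mean_squared_error(predicted, target):
    """Mean squared difference (hours^2) between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def proportion_correct(predicted, windows):
    """Share of nuclei whose predicted age lies inside their collection window.

    ASSUMPTION: this reading of "proportion correct" is a guess made for
    illustration; the study's exact definition is not given in this section.
    """
    inside = sum(1 for age, (start, end) in zip(predicted, windows)
                 if start <= age <= end)
    return inside / len(windows)


# Hypothetical held-out nuclei: collection windows (hours) and model outputs.
windows = [(0, 2), (1, 3), (3, 7), (6, 10)]
centers = [(start + end) / 2 for start, end in windows]  # regression targets
predicted = [1.5, 3.4, 4.0, 8.5]

mse = mean_squared_error(predicted, centers)
accuracy = proportion_correct(predicted, windows)
```

Because the regression target is the window center rather than a true age, a nonzero MSE is expected even for a perfect model: nuclei from asynchronous embryos genuinely differ in age within a window, and that spread is the signal the approach exploits.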

To further assess accuracy, we applied the 
scRNA-derived models to a bulk RNA-seq time 
course of staged embryos in 2-hour intervals 
(15) and found high concordance between 
predicted and actual developmental age (Fig. 2C). 
The scATAC-derived models were similarly 
able to order a time course of bulk DNase se- 
quencing (DNase-seq) data from either whole 
embryos or specific fluorescence-activated cell 
sorting (FACS)-purified lineages (17) (Fig. 2D). 
To assess predicted ages at much finer time 
scales (minutes rather than hours), we focused 
on genes whose expression is activated at 
specific nuclear cycles during zygotic genome 
activation (ZGA) (16). Genes turning on during 
ZGA were dynamically up-regulated in asso- 
ciation with predicted nuclear ages (scRNA- 
based; 5-min increments), whereas maternal 
and silent genes were not (Fig. 2E). Early dy- 
namically accessible enhancers and promoters 
could similarly be predicted (scATAC-based; 
1-min increments) (Fig. 2F), opening in the 
same order as previously observed by bulk 
ATAC-seq of hand-picked embryos at 3-min 
intervals (Fig. 2G) (17). To further illustrate the 
value of this framework, we note that pseudo- 
bulk profiles corresponding to collection win- 
dows lead to piecewise expression dynamics 
(Fig. 2H). By contrast, pseudobulk profiles 
based on model-predicted ages yield more 
continuous dynamics (Fig. 2I). 
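
The contrast between window-based and age-based pseudobulk profiles comes down to how nuclei are grouped before averaging. A minimal sketch of the age-based grouping follows, assuming one gene's normalized expression per nucleus and a fixed bin width; the function name and toy values are hypothetical.

```python
from collections import defaultdict


def pseudobulk_by_age(ages_hours, expression, bin_minutes=10):
    """Mean expression of one gene per predicted-age bin.

    Returns a list of (bin_start_minutes, mean_expression) sorted by time.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for age, value in zip(ages_hours, expression):
        # Floor the age (converted to minutes) to the start of its bin.
        bin_start = int(age * 60 // bin_minutes) * bin_minutes
        sums[bin_start] += value
        counts[bin_start] += 1
    return [(b, sums[b] / counts[b]) for b in sorted(sums)]


# Toy nuclei from a single collection window (illustrative only).
ages = [0.05, 0.10, 0.20, 0.30, 0.35]
expr = [0.0, 2.0, 4.0, 6.0, 8.0]
profile = pseudobulk_by_age(ages, expr, bin_minutes=10)
```

Averaging the same five nuclei by collection window would give a single value of 4.0, whereas the 10-min bins recover a rising trend, which is the effect contrasted in Fig. 2, H and I.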

Although there are similarities between the 
goal of our approach and the concept of 
pseudotime (18), a key advantage of inferred age 
is that, both in training and prediction, cells 
are anchored to absolute time, which enables 
more interpretable ordering of cellular pro- 
cesses as well as their synchronization across 
lineages. One concern is that contamination 
with embryos whose developmental age falls 
outside the collection window will have ex- 
aggerated confounding effects on early time 
points because older embryos contain vastly 
more nuclei. Consistent with this, our model 
predicted that 2.8% of the ~80,000 scRNA- 
profiled cells from 0 to 2 hours were at least 
4 hours in developmental age. These older 
cells represent the majority of a discrete clus- 
ter in uniform manifold approximation and 
projection (UMAP) space (fig. S5A). Similar 
contamination is also observed with scATAC 
profiles from this early time window (12.7% of 
~20,000 cells; fig. S5, B to D). Clustering and 
visualizing only the cells inferred to be 0 to 
2 hours in age eliminates this developmentally 
advanced cluster (fig. S5E). 
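
The contamination check described above amounts to comparing each nucleus's inferred age with the window it was collected in. A minimal sketch follows, using the 0- to 2-hour window and the 4-hour cutoff from the text; the function name, cell identifiers, and ages are hypothetical.

```python
def split_by_inferred_age(cells, window=(0.0, 2.0), contaminant_cutoff=4.0):
    """Separate in-window nuclei from likely contaminants.

    cells is a list of (cell_id, inferred_age_hours), all drawn from one
    collection window. Returns (kept_ids, contaminant_fraction).
    """
    kept = [cid for cid, age in cells if window[0] <= age <= window[1]]
    contaminants = [cid for cid, age in cells if age >= contaminant_cutoff]
    fraction = len(contaminants) / len(cells) if cells else 0.0
    return kept, fraction


# Toy nuclei from the 0- to 2-hour collection (illustrative only).
cells = [("c1", 0.4), ("c2", 1.1), ("c3", 1.9), ("c4", 5.2),
         ("c5", 0.8), ("c6", 2.6), ("c7", 1.5), ("c8", 6.0)]
kept, contaminant_fraction = split_by_inferred_age(cells)
```

Nuclei such as `c6`, inferred to be slightly older than the window but younger than the cutoff, are neither kept nor counted as contaminants here; how to treat them is a design choice that the sketch leaves open.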


Annotation and inference of diversifying 
developmental trajectories 


To systematically track the emergence and di- 
versification of developmental trajectories, we 
used inferred ages to separately process and 
cluster cells from a series of 2-hour nonover- 
lapping time windows. Clusters were then 
annotated by leveraging stage-matched infor- 
mation on gene expression from thousands of 
in situ hybridizations spanning embryogenesis 
as well as extensive enhancer activity data 
(12, 19, 20) (Fig. 3, A and B). 

Notably, the last few hours of the time 
course had reduced numbers of inferred cells 
(e.g., after 18 hours, 61% fewer than would be 
expected under uniform sampling) and fewer 
identified clusters (fig. S6A). We suspect that 
this may be the result of edge effects of the 
model because we also observe reduced num- 
bers of inferred cells for the first several hours, 
although there they have less effect because 
the data from early time points lack extensive 
structure. For this reason, we excluded cells 
with an inferred age of >18 hours from this set 
of analyses. 

Here, we use cell state to mean an anno- 
tated cluster at a given time window. Altogeth- 
er, we identified 171 cell states in sci-ATAC-seq 
data and 268 in sci-RNA-seq data across the 
nine time windows, each of which received 
one of 38 cell type annotations for ATAC or 
one of 54 cell type annotations for RNA (tables 
S1 and S2 and Fig. 3, A and B). Across time 
windows, we identified an average of 109 marker 
genes and 2469 marker accessible regions per 
cluster (tables S3 and S4). 

The early stages of Drosophila embryogenesis, 
represented by our 0- to 2-hour time 
window, include 13 rapid nuclear divisions 
within a syncytium that generates 6000 nuclei, 
regulated by maternal genes. At ~2 hours and 
20 min after fertilization, cellularization occurs 
and the zygotic genome is activated (21), fol- 
lowed by gastrulation to generate the three 
germ layers. Our single-cell data recapitulate 
these events, where the earliest time window 
(0 to 2 hours) has two large clusters annotated 
as maternal or unknown. At 2 to 4 hours, the 
maternal cluster is no longer present, and in- 
stead, pole cells and anlage clusters appear. A 
notable expansion in the diversity of cell types 
follows across 6 to 10 hours, matching expec- 
tations for when the major lineages in each 
germ layer are specified (Fig. 3, A and B). 

To follow the emergence and diversification 
of cell lineages, we systematically linked cell 
clusters across developmental time, applying 
similar methods as in earlier studies (3, 22) to 
coembeddings of cells from adjacent non- 
overlapping, inferred time windows (fig. S6, B 
and C). For cells of each state derived from the 
“child” time window, we calculated the me- 
dian proportion of nearest neighbors from 
the “parent” window that were derived from 
each potential parental cell state and treated 
this as the weight of the corresponding edge. 
The maximum edge weights >0.2 were re- 
tained, resulting in acyclic, directed graphs, 
independently generated from scRNA and 
scATAC data (Fig. 3, C and D). Although these 
procedures were generated independently of 
our cell cluster annotations at each time win- 
dow, they overwhelmingly yielded internally 
consistent results. For example, muscle clus- 
ters in one time window connect to muscle 
clusters in the next time window, and the 
same is true for other major lineages (e.g., 
central nervous system, peripheral nervous sys- 
tem, etc.) as embryogenesis proceeds. We note 
that some paths seem to terminate prema- 
turely, potentially because of drastic increases 
in cell number in later embryogenesis, which 
were not matched by corresponding increases 
in our sampling, or because of unknown 
technical or biological factors. More generally, 
because these are inferences based on cellular 
state rather than lineage tracing, they may be 
prone to certain kinds of error (3). 
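
The edge-weighting step described above can be sketched as follows. This is a minimal illustration, not the implementation used for the analysis: the function name `link_cell_states` and its inputs are hypothetical, the nearest-neighbor search in the coembedding is assumed to have been run beforehand, and retaining one strongest edge per child state is our reading of "maximum edge weights >0.2".

```python
from collections import Counter, defaultdict
from statistics import median


def link_cell_states(child_states, parent_neighbor_states, min_weight=0.2):
    """Link each child cell state to its best-supported parent cell state.

    child_states[i] is the annotated state of child-window cell i.
    parent_neighbor_states[i] lists the states of that cell's nearest
    neighbors drawn from the parent window (must be non-empty).
    Both inputs are hypothetical stand-ins for the real coembedding output.
    """
    parent_states = sorted({s for nbrs in parent_neighbor_states for s in nbrs})
    # fractions[child][parent] holds, per child cell, the share of its
    # parent-window neighbors that belong to that parent state.
    fractions = defaultdict(lambda: defaultdict(list))
    for state, neighbors in zip(child_states, parent_neighbor_states):
        counts = Counter(neighbors)
        for parent in parent_states:
            fractions[state][parent].append(counts[parent] / len(neighbors))

    edges = {}
    for child, per_parent in fractions.items():
        # Edge weight: median proportion across all cells of the child state.
        weights = {parent: median(vals) for parent, vals in per_parent.items()}
        best = max(weights, key=weights.get)
        if weights[best] > min_weight:  # keep the strongest edge if it exceeds 0.2
            edges[child] = (best, weights[best])
    return edges
```

Under this reading, each child state keeps at most one incoming edge from the preceding window, so chaining the windows together yields a directed graph without cycles.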

To illustrate the potential of these data to 
facilitate exploration of specific lineages at 
finer resolution, we reanalyzed 59,012 cells 
annotated as neuroectoderm using scRNA 
data from 6 to 18 hours (Fig. 3E and fig. S7A). 
This revealed 20 subclusters, including a large 
group of early cells corresponding to the brain 
primordium and neural progenitors that ex- 
press regulators of neurogenesis, such as Notch 
(N) and Delta (Dl), and neuroblast temporal
TFs, such as miranda (mira) and castor (cas). 
Two additional neural progenitor clusters cor- 
respond to sensory progenitors, whereas im- 
mature neurons express low levels of both 
neural progenitor and pan-synaptic genes, in- 
cluding cacophony (cac) and synaptotagmin 
1 (Syt1). Mature neurons are marked by higher
levels of pan- and subtype-specific synaptic 
genes coupled with low or no expression of 
earlier developmental genes. Finally, midline 
cells, consisting of both neurons and glia, cluster
together and become evident at 6 to 8 hours;
using the midline TF single minded (sim) and 
glial immunoglobulin family member wrap- 
per as markers, we can follow them forward 
in time as they mature (fig. S7B). We can also 
follow the maturation of sensory neural pro- 
genitors, marked by shaven (sv), from 6 to 
16 hours (fig. S7B). 

To further explore neuronal diversity, we 
reclustered 6703 mature neurons, revealing 
11 neuronal subtypes, which we manually 
curated (Fig. 3F). Among these, we identify 
four clearly separable sensory cell clusters. 
There are two types of Drosophila sensory 
neurons based on dendritic morphology: 
type I sensilla, which include both external 
sensory (ES) neurons and internal chordo- 
tonal (Ch) neurons, and type II multidendritic 
(MD) neurons. We can clearly distinguish MD 
neurons on the basis of expression of genes, 
such as dendritic arbor reduction 1 (dar1),
which promotes their characteristic branch- 
ing dendrites, and the pseudouridine synthase 
RluA-1, which was recently identified as a 
marker of MD neurons (23) (Fig. 3, F and G). 
Consistent with their nociceptive role, this 
cluster also specifically expresses the mechan- 
ical nociception degenerin/epithelial sodium 
channel subunits pickpocket (ppk) and ppk26. 
Mechanosensory ES neurons are specified by 
the TF hamlet (ham), which is specifically ex- 
pressed in the middle sensory cluster (Fig. 3, F 
and G) (24). The adjacent cluster, likely Ch 
sensory neurons, is identified by expression 
of the mechanosensitive nonselective cation 
channel subunit no mechanoreceptor potential
C (nompC) as well as fate-determinant Rfx and
a number of as-yet uncharacterized genes spe- 
cific to this cluster (25, 26) (Fig. 3, F and G). 
The final sensory cluster likely corresponds to 

Fig. 3. Annotation of diversifying developmental trajectories. (A) UMAP visualization
of nonoverlapping, inferred 2-hour time windows for scRNA clusters colored by cell
state annotation. Dashed boxes highlight neuroectodermal clusters. (B) Same as (A),
but for scATAC data. PNS, peripheral nervous system; CNS, central nervous system.
(C) scRNA-based acyclic directed graph representation of clusters linked through
nonoverlapping time windows. (D) Same as (C), but from scATAC data. (E) UMAP of
scRNA data for ~60,000 annotated neuroectodermal cells (i.e., cell states highlighted
in (A) with dashed boxes), colored by cluster. (F) UMAP of ~6000 mature neurons,
colored by cluster. The chordotonal glia cluster includes Ch and ES organ glial-like
support cells. (G) Dot plot showing marker gene expression for annotated clusters
in (F). (H) In situ hybridization of stage 16 embryos, showing the expression of
lncRNA CR31451, cpx, and CG4328 in the nervous system. A tissue marker (elav) is
provided in the top panel. A lateral and ventral embryo view is shown for each gene.

Ch glial-like support cells based on the expression
of glial markers, including moody, and
Cbl-associated protein (CAP) and nompA, which 
promote the development and function of Ch 
support cells, respectively (Fig. 3, F and G). On 
the basis of vesicular neurotransmitter trans- 
porter expression, we also identify two clusters 
of central cholinergic neurons, a glutamatergic 
cluster that likely includes motor neurons, and 
monoaminergic neurons (Fig. 3, F and G). Finally,
peptidergic neurons cluster separately
and were identified on the basis of the expression
of neuropeptides [ion transport peptide
(ITP)], enzymes involved in their synthesis
[amontillado (amon)], and receptors [myosuppressin
receptor 1 (MsR1)] (Fig. 3, F and G).
We validated the expression of the uncharacterized
long noncoding RNA (lncRNA) CR31451
as enriched in mature neurons as well as two
genes, complexin (cpx) and CG4328, identified 
in our analysis as enriched in the monoami- 
nergic cluster, which includes midline neurons 
(Fig. 3H). This neuronal subtype enrichment is 
unexpected for cpx, which encodes a presynap- 
tic regulator of synaptic vesicle release, and may 
point to additional requirements for Cpx in 
midline monoaminergic neurons. In the course 
of exploring these fine neuronal subtypes, we
also made an unexpected finding regarding 
elav, a classic marker gene for neurons. Spe- 
cifically, we noticed lower-level expression of elav 
in clusters annotated as visceral muscle. Per- 
forming double fluorescent in situ hybridiza- 
tion with a visceral muscle-specific marker gene 
(biniou) confirmed this unexpected finding (fig. 
S7C) and raises the possibility of a previously
unknown role for this well-studied gene.

This deeper exploration of the neuroecto- 
derm, validating and extending years of re- 
search from many groups, illustrates the depth 
of information that can be obtained from 
these data. We additionally performed a more 
detailed annotation of nonmyogenic meso- 
derm (supplementary note 1). A full explora- 
tion of all lineages represented in these data 
will require a community-wide effort by tis- 
sue experts (as done in this study for neuronal 
diversity). 

In addition to delineating developmental 
trajectories, these data can also capture spatial 
differences arising during developmental pat- 
terning. Previous bulk ATAC-seq on embryo 
halves has shown variability in the accessibil- 
ity of enhancers along the anterior-posterior 
(A-P) axis of the blastoderm embryo (27). Using 
label transfer to map anterior or posterior 
identities from a previous blastoderm dataset 
(12) onto our 2- to 4-hour data, we computed
a positional accessibility skew score for vali- 
dated enhancers with strict A-P activity (27). 
This indicates that accessibility of most A-P 
enhancers is skewed in the expected anterior 
or posterior cell group (fig. S7D), recapitulat- 
ing the bulk data (27). Notably, we also iden- 
tify differences among enhancers of the same 
gene. For example, in the eve locus, the stripe 
1 enhancer has a much stronger skew for 
anterior accessibility compared with stripe 
2, as has also been previously reported (27). 
Our single-cell data thus capture the biological 
variability in enhancer accessibility along the 
A-P axis, extending previous observations. We 
similarly could transfer labels from our sci- 
RNA-seq clusters to spatial coordinates from a 
spatial enhanced resolution omics sequencing 
(Stereo-seq)-based spatial study of Drosophila 
embryos at 14 to 16 hours and 16 to 18 hours of 
development (28). Using the assigned annota- 
tions of tissues from the spatial study, we 
observe a correspondence with our cluster 
annotations, which again suggests that spatially
relevant variability is present in these data
(fig. S7E). 


Tracing dynamic gene modules across development


To further leverage continuous views of un- 
folding trajectories, we next explored the gene 
regulatory modules active in germ layer-specific 
development. We focused on the mesoderm and 
its derivatives as a complex, well-characterized 
system that we and others have studied previously
(11, 13, 29, 30). For this, we selected all
cells corresponding to mesoderm-derived cell 
states, collectively 51,338 (scRNA) and 200,907 
(scATAC) profiles across 4 to 20 hours and 2 to 
20 hours of inferred developmental age, re- 
spectively (Fig. 4, A and B). 

Focusing first on RNA, we selected the top 
2000 most variable genes. After normalizing 
expression values to be comparable across 
time, we used dynamic time warp clustering 
to group genes into four clusters with distinct 
temporal regulation (Fig. 4C, fig. S8A, and 
table S6). These clusters define broad succes- 
sive waves of gene expression during mesoderm 
development (Fig. 4D) and notably exhibit 
similarly ordered waves of chromatin acces- 
sibility (fig. S8, B and D, and supplementary 
note 2). Gene pathway enrichment suggests 
different functional roles for each cluster (fig. 
S8C). Cluster 1 genes (n = 571) are highly expressed
from the beginning of mesoderm development
(directly after gastrulation; 4 to
9 hours); are enriched for TFs (P = 1.4 x 107); 
and likely represent a mixture of genes in- 
volved in progenitor cells, mesoderm devel- 
opment, and transcriptional activation (Fig. 4D 
and fig. S8C). Cluster 2 genes (n = 433) peak at 
~9 to 11 hours, during the subdivision of the 
mesoderm into different muscle primordia and 
their subsequent specification. This cluster is 
enriched for genes involved in mesoderm de- 
velopment, including myoblast fusion and 
myotube differentiation, while losing enrich- 
ment for stem cell and self-renewal terms (Fig. 
4D and fig. S8C). By contrast, cluster 3 genes 
(n = 365) initiate expression at ~10 hours and 
steadily increase to the end of embryogenesis, 
whereas cluster 4 genes (n = 631) only switch
on at ~15 hours, during muscle terminal dif- 
ferentiation. The last cluster lacks enrichment 
for TFs and rather includes genes involved in 
myofibril assembly and muscle assembly and 
maintenance as well as essential contractile 
proteins for differentiated muscle (Fig. 4D and 
fig. S8C). We validated the spatiotemporal ex- 
pression of five poorly characterized genes by 
in situ hybridization, confirming that they are 
expressed in the mesoderm or muscle at the 
inferred time window (Fig. 4E). 
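
As an illustration of how temporal expression profiles can be grouped by dynamic time warping (DTW), the sketch below pairs a textbook DTW distance with a simple k-medoids loop. The choice of k-medoids, the min-max scaling used to make profiles comparable, and all function names are assumptions made for this example; the text does not specify which DTW clustering implementation was used.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def min_max_scale(profile):
    """Rescale one gene's profile to [0, 1] (assumed normalization)."""
    lo, hi = min(profile), max(profile)
    if hi == lo:
        return [0.0] * len(profile)
    return [(x - lo) / (hi - lo) for x in profile]


def cluster_profiles(profiles, k, n_iter=20):
    """Cluster gene profiles with k-medoids on pairwise DTW distances."""
    scaled = [min_max_scale(p) for p in profiles]
    n = len(scaled)
    dist = [[dtw_distance(scaled[i], scaled[j]) for j in range(n)] for i in range(n)]

    # Deterministic farthest-first choice of the initial medoids.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(dist[i][m] for m in medoids)))

    def assign(meds):
        return [min(range(k), key=lambda c: dist[i][meds[c]]) for i in range(n)]

    for _ in range(n_iter):
        labels = assign(medoids)
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if members:
                # New medoid: the member with the smallest total distance to the rest.
                updated.append(min(members, key=lambda i: sum(dist[i][j] for j in members)))
            else:
                updated.append(medoids[c])
        if updated == medoids:
            break
        medoids = updated
    return assign(medoids)
```

With profiles for the 2000 most variable genes and `k=4`, a call such as `cluster_profiles(profiles, 4)` would return one cluster index per gene. The quadratic cost of the pairwise distance matrix makes this pure-Python version suitable only for small demonstrations.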

The temporal and cell type-specific nature 
of these expression signatures for both the 
downstream effector molecules and their up- 
stream regulators should provide the resolu- 
tion to order genes into putative regulatory 
hierarchies. For example, several genes with 
essential roles in muscle differentiation, such 
as myosin heavy chain (Mhc), are present in
clusters 3 and 4. Mhc protein plays a critical 
role in providing muscle-contractile force. Our 
scRNA data show increasing Mhc expression 
along the muscle lineages in cells with later 
embryonic ages (Fig. 4, A and F), matching the 
expression pattern of Mhc. Concomitantly, there
is a gradual increase in open chromatin at
characterized Mhc enhancers at later stages 
along multiple muscle trajectories (Fig. 4G). 

Before the expression of Mhc and other 
muscle differentiation genes, we observe tran- 
sient expression of mesoderm-associated TFs 
(cluster 2; Fig. 4C). One example is Kahuli 
(Kah), a TF associated with muscle develop- 
ment, which has peak expression at 10 hours 
of embryogenesis (cluster 2; Fig. 4, C, D, F, 
and G). To investigate the relationship be- 
tween open chromatin and gene expression, 
we computed gene activity scores, defined as 
the sum of sci-ATAC-seq reads in the gene 
body and the 2 kb flanking the transcription 
start site (TSS). The gene activity scores for 
both Mhc and Kah recapitulate their sequential
temporal patterns of expression, with Kah’s
activity signature appearing earlier along the 
mesodermal trajectories compared with that 
of Mhc (Fig. 4, F and G). To determine the
extent to which we could map the exact order- 
ing of accessibility and expression changes, we 
overlaid the scaled expression values and gene 
activity scores averaged across bins with equal 
numbers of cells (Fig. 4G). Notably for Kah, 
gene expression temporally follows the trajec- 
tory of the corresponding gene activity score 
based on open chromatin, suggesting an order- 
ing where first the gene body becomes acces- 
sible followed by accumulating levels of the 
corresponding transcript; however, this was 
not the case for Mhc, for which expression and
accessibility increased in tandem (Fig. 4G). 
Kah binds to several characterized Mhc en- 
hancers near the gene’s promoter, as observed 
in bulk ChIP sequencing (ChIP-seq) data (14),
which suggests a regulatory link between Kah 
and Mhc expression (Fig. 4H). 
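
The gene activity score and the binned, scaled profiles compared above can be sketched as follows. This is a minimal sketch under stated assumptions: reads are represented by single genomic positions, the 2 kb flank is taken on each side of the TSS, bins hold near-equal numbers of cells ordered by inferred age, and both function names are hypothetical.

```python
def gene_activity_score(read_positions, gene_start, gene_end, tss, flank=2000):
    """Count reads falling in the gene body or within `flank` bp of the TSS.

    Because the TSS lies at one end of the gene body, the union of the two
    regions is a single interval on either strand.
    """
    lo = min(gene_start, tss - flank)
    hi = max(gene_end, tss + flank)
    return sum(1 for pos in read_positions if lo <= pos <= hi)


def binned_scaled_profile(ages, values, n_bins):
    """Average `values` in bins of near-equal cell count, then min-max scale.

    Cells are ordered by inferred age; requires len(ages) >= n_bins.
    """
    order = sorted(range(len(ages)), key=lambda i: ages[i])
    n = len(order)
    means = []
    for b in range(n_bins):
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        means.append(sum(values[i] for i in idx) / len(idx))
    lo, hi = min(means), max(means)
    return [(m - lo) / (hi - lo) if hi > lo else 0.0 for m in means]
```

Applying `binned_scaled_profile` separately to per-cell expression and to per-cell gene activity scores gives two curves on a common 0 to 1 scale, which is what makes a lag between accessibility and transcript accumulation visible for a gene such as Kah.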

To extend this analysis more globally, we 
searched for TF motifs enriched in putative 
enhancers (mesoderm-specific scATAC peaks
1 to 10 kb upstream of the TSS) of genes be- 
longing to each of the four scRNA mesoderm 
expression clusters. This identified 458 TF 
motif-to-cluster enrichments (g < 1 x 107? and 
presence in >1% of target peaks; table S7) corresponding
to 152 unique TFs. Of these, 31
are TFs whose expression changes along meso- 
derm differentiation and are thus included in 
the expression-based clustering (table S7). These 
31 include many TFs essential for mesoderm 
development, including a number of direct 
target genes of the master regulator Twist (the 
functional ortholog of MyoD) at the beginning 
of mesoderm development (e.g., 2b, en, Ubx, 
and pb), and concordantly expressed in the 
first temporal cluster. These factors have many 
functions, including setting up the segmen- 
tation of the mesoderm, regulating the ex- 
pression of somatic muscle identity genes, 
establishing midgut constrictions in the vis- 
ceral mesoderm, and heart patterning. Other 
examples from the second and third temporal 

Fig. 4. Dynamic regulation of mesoderm-specific gene modules. (A) UMAP of
scRNA (left) or scATAC (right) data for all mesodermal cells, colored by inferred
developmental age. (B) Same as (A), but colored by reprocessed Leiden-based
clusters. (C) Normalized expression of mesoderm genes across inferred developmental
time. (D) Average expression of the gene modules across inferred time.
(E) In situ hybridization experiments validating temporal expression of selected genes
with predicted expression in mesoderm and muscle (for asterisks, see
supplementary note 3). (F) Same as (A), but expression of Kah (cyan) and Mhc


clusters are genes required for cell fate spe- 
cification of somatic muscle founder cells (e.g., 
Six4 and ap) and heart development (e.g., tup 
and Lim3). 

We note that this approach may miss the 
contribution of important TFs that were not 
variably expressed in mesoderm. In particular, 
if a TF is variably expressed and has corre- 
sponding variability in motif activity, this TF 
is likely active. However, this does not imply 
that all expressed TFs are active (e.g., there 
may be coactivators or posttranslational modifications
that are required). This caveat notwithstanding,
these analyses highlight the potential
for further discovery of coregulated
gene modules related to distinct germ layers 
or cell types. 


Nominating stage- and cell type-specific TF regulators

We next investigated whether we could lever- 
age the diversity of cell states across embryogen- 
esis to infer which TFs drive specific programs 
of cell type differentiation. For this, we used all 
(purple) is overlaid. Points from cells that express both Kah and Mhc are colored
dark blue. (G) Comparison of gene activity score (solid line) and gene expression
(dashed line) over the continuum of inferred developmental age for Kah (cluster 2) and
Mhc (cluster 3) in mesoderm-annotated cells. Gene activity scores and expression
were binned into 100 equal partitions by inferred age, averaged, and scaled to 0 to
1 with min-max values. (H) Chromatin accessibility profile surrounding Mhc for
pseudobulk mesoderm cells from 6 to 16 hours inferred time in 2-hour increments,
along with Kah ChIP-seq generated from 0- to 16-hour whole embryos (14).


scATAC clusters at all time points (in contrast 
to the scRNA-focused cluster analysis above) 
and searched for differential enrichment of TF 
position weight matrices (PWMs) within each 
cluster’s open chromatin regions. 

We first characterized enrichments across 
clusters from the 10- to 12-hour time window 
based on predicted time (Fig. 5A). Encourag- 
ingly, hierarchical clustering of the enrichment 
profiles of all associated PWMs grouped each 
cluster roughly by germ layer (this was also 
observed in other time windows; fig. S9A). The 

Fig. 5. Integration of scRNA and scATAC data to identify TFs with potential
regulatory roles across differentiating tissues and developmental time. 

(A) Heatmap with averaged chromatin accessibility differences associated with 
the 50 most variable TF-specific motifs from all cells in annotated ATAC-seq 
clusters from 10 to 12 hours. Arrows indicate TFs discussed in the main text. 
(B) Correlation between expression and motif-associated accessibility grouped 
by expression activation- or repression-associated GO categories. TFs in GO 
pathways for gene activation are linked to increasing chromatin accessibility. 
(C) Comparison of gene expression (y axis) and motif-associated chromatin 
accessibility (x axis) across NNLS-linked clusters for the TFs Sage (left), GATAe 
(middle), and Awh (right). Each TF's corresponding PWM is inset in each plot, 
with the size of each base scaled by information content. (D) Heatmaps of
estimated effects of gene expression at predicting motif-associated chromatin
accessibility changes through time in different germ layers. Displayed TFs 
had three or more consecutive time windows with a significant (P < 1 x 107°) 
and sign-consistent effect. Arrows indicate TFs discussed in the main text. 
(E) Heatmap of expression at Zelda-responsive genes (right) and aggregated 
chromatin accessibility (left) at their Zelda-bound cis-regulatory regions 

(38, 39). Values were averaged in 1-min windows over 0 to 3 hours of 
development. The red and blue bars to the left indicate two temporal clusters 
of expression of Zelda-responsive genes. (F) Smoothed average expression 
and accessibility for the two Zelda temporal clusters from (E). (G) Proportion of 
accessible regions from (E) that are bound by Zelda in clusters 1 and 2 in 
ChIP-seq data (39) from different nuclear cycles (NCs). 




nonmyogenic mesoderm (fat body) and myo- 
genic mesoderm (somatic muscle) cluster to- 
gether (Fig. 5A). Open chromatin regions in 
the myogenic clusters are enriched in motifs 
for many TFs known to play a role in muscle 
development, including Mef2 and Fork head
(Fkh) TFs. The myogenic clusters also appear 
close to two neuronal clusters (Fig. 5A), which 
is driven by shared motif enrichment with 
neuroectoderm and glial cells, particularly 
many C2H2 zinc finger TFs, including Btd,
CG7368, Crol, Sr, and Dar1. Many of these
factors have known roles in neuronal development
(e.g., Dar1), whereas Stripe (Sr) is essential
for muscle tendon cell fate and muscle
attachment in the epidermis at late stages of
embryogenesis (31).

Because members of the same family of TFs 
typically recognize similar motif sequences 
(e.g., GATAe, GATAd, and pnr), it is often diffi- 
cult from motif analysis alone to pinpoint the 
responsible TF. To address this, we leveraged 
our scRNA data to identify the most likely 
active TF on the basis of its expression within 
the clusters among all factors that share the 
same motif binding pattern. First, we used a 
regression-based framework to integrate the 
scATAC and scRNA datasets and identify links 
between the different cell clusters (1, 6). Specifically,
we adopted a nonnegative least squares
(NNLS) matrix factorization approach that 
decomposes expression data as a mixture of 
components derived from proximal gene ac- 
tivity scores generated from the scATAC data. 
Despite possible temporal differences between 
accessibility and expression, NNLS identifies 
stronger links between clusters from the same 
2-hour window compared with those from ad- 
jacent 2-hour windows (fig. S9B). We also in- 
ferred NNLS links in the opposite direction by 
decomposing proximal gene activity scores by 
gene expression associated with scRNA clus- 
ters. For each cluster of a given data type, the 
result of NNLS factorization is a mixture pro- 
portion of clusters from the other data type, 
where a higher value represents a stronger as- 
sociation between the scRNA and scATAC 
cluster (fig. S9, C to F, and table S8). This factor 
decomposition approach resulted in a strong 
linkage (NNLS-mixture coefficient of >0.1) of 
120 cell state clusters present in the same in- 
ferred time windows, with most of the strongly 
linked clusters being from 4 to 6 hours onward. 
Upon manual inspection, many linked scATAC
and scRNA clusters, which had been indepen- 
dently annotated, are from matching tissues. 
For example, from the 10- to 12-hour window, 
the epidermis cluster (cluster 0) in scATAC data 
was matched to the epidermis (cluster 3) in 
scRNA data. Altogether, of 21 ATAC clusters 
from the 10- to 12-hour window, 16 had a linked 
RNA annotation with a NNLS correlation value 
>0.1, of which 14 were between comparable 
tissue annotations. 
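
To make the NNLS decomposition concrete, the sketch below expresses one scRNA cluster's expression vector as a nonnegative mixture of scATAC cluster gene activity profiles. It is an illustration under stated assumptions, not the pipeline used here: the solver is plain projected gradient descent (chosen only to keep the example dependency-free), normalizing coefficients to proportions is our assumption, and the function names are hypothetical.

```python
def nnls_projected_gradient(A, b, n_iter=2000):
    """Minimize ||A x - b||^2 subject to x >= 0.

    A is a list of rows (genes x scATAC clusters) of gene activity scores;
    b is one scRNA cluster's expression vector over the same genes.
    """
    n_rows, n_cols = len(A), len(A[0])
    gram = [[sum(A[r][i] * A[r][j] for r in range(n_rows)) for j in range(n_cols)]
            for i in range(n_cols)]
    atb = [sum(A[r][i] * b[r] for r in range(n_rows)) for i in range(n_cols)]
    trace = sum(gram[i][i] for i in range(n_cols))
    if trace == 0:
        return [0.0] * n_cols
    step = 1.0 / trace  # the trace bounds the largest eigenvalue of A^T A
    x = [0.0] * n_cols
    for _ in range(n_iter):
        grad = [sum(gram[i][j] * x[j] for j in range(n_cols)) - atb[i]
                for i in range(n_cols)]
        # Gradient step followed by projection onto the nonnegative orthant.
        x = [max(0.0, xi - step * gi) for xi, gi in zip(x, grad)]
    return x


def mixture_proportions(coefs):
    """Normalize nonnegative coefficients so that they sum to 1."""
    total = sum(coefs)
    return [c / total for c in coefs] if total > 0 else list(coefs)
```

Running the decomposition once per scRNA cluster, and then again with the roles of the two data types swapped, gives the two directions of linkage described above; a cutoff such as 0.1 on the resulting coefficients then selects the strongly linked cluster pairs.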

These integrated scRNA and scATAC clus- 
ters, which span 0 to 18 hours of embryogen- 
esis, enabled a more direct analysis of the role 
of specific TFs in different cell types’ differen- 
tiation. We reasoned that active TFs should be 
more highly expressed in cell types for which 
they have a functional role, and their associ- 
ated PWM should be more enriched or de- 
pleted in accessible regions when the TF is 
activating or repressing expression (6). In line 
with this, correlation values between motif- 
associated accessibility and gene expression 
were shifted toward more positive values for 
TFs annotated [by gene ontology (GO)] as ac- 
tivators and toward more negative values for 
annotated repressors (Fig. 5B and table S9), a
trend also observed in human fetal tissues (6). 
This approach of linking TFs’ cluster-specific 
expression and motif enrichments allowed us 
to nominate TFs as active at specific times in 
specific tissues (Fig. 5C). For example, this 
analysis predicts a specific role for Sage in 
salivary gland development, as the salivary 
gland is the only cell type exhibiting both high 
expression of the sage transcript and high ac- 
cessibility of the Sage-associated PWM (Fig. 
5C, top). This finding matches the essential 
role for sage in salivary gland development, as 
determined by genetic loss-of-function analy- 
sis (32). Similar predictions were made for 
GATAe in the midgut at 16 to 18 hours and 
Awh in the epidermis at 14 to 16 hours (Fig. 5C, 
middle and bottom), matching the functional 
role for both TFs in midgut endoderm (33) and 
epidermis (34, 35) development, respectively. 
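
The reasoning above, that an activator's expression should rise with the accessibility of its motif across linked clusters whereas a repressor's should fall, can be illustrated with a per-TF correlation. This is a simplified sketch: the cutoff `min_abs_r` and the function names are hypothetical, and the inputs are assumed to be one value per linked scRNA-scATAC cluster pair.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def nominate_mode(tf_expression, motif_accessibility, min_abs_r=0.5):
    """Label a TF from the sign of its expression-accessibility correlation.

    tf_expression[i] and motif_accessibility[i] refer to the same linked
    cluster pair; min_abs_r is an illustrative cutoff, not a published one.
    """
    r = pearson(tf_expression, motif_accessibility)
    if r >= min_abs_r:
        return "putative activator", r
    if r <= -min_abs_r:
        return "putative repressor", r
    return "unresolved", r
```

A TF such as Sage, with high expression and high motif accessibility in a single cluster, would produce a strongly positive correlation in this scheme.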

To expand this analysis and systematically 
nominate TFs that potentially drive germ layer- 
specific differentiation programs, we fit a lin- 
ear model that predicts a TF’s motif-associated 
chromatin changes from an estimated effect of 
an interaction term that includes the expres- 
sion level of the TF in a specific germ layer and 
time window. Our model’s effect estimates can 
identify TFs with specific motif activity in par- 
ticular germ layers and suggest time windows 
from which a TF initiates its activity. For ex- 
ample, the model refined the role of Sage as 
becoming active in the ectoderm germ layer 
specifically from 10 to 12 hours onward and 
the activity of GATAe initiating in the endo- 
derm from 8 to 10 hours onward (Fig. 5D, top). 
Such a model encompassing germ layers across 
development time may also identify additional 
likely coactive TFs. For example, in addition to 
Sage, we found Fkh to be both coexpressed 
and coactive in the ectoderm—a TF reported 
to act together with Sage to activate salivary 
gland-specific genes (36). 

This analysis also generated additional in- 
teresting findings for other time points and 
germ layers [e.g., Fruitless (Fru); supplemen- 
tary note 4 and Fig. 5D]. Altogether, from eight 
high-level germ layer-associated tissue anno- 
tations and 316 TF motifs tested, we identified 
1258 significant (Benjamini-Hochberg-corrected 
P<1x 10°) TF-to-tissue relationships having 
both associated expression and chromatin ac- 
tivity at one or more of the nine time windows 
assessed. We note that in time windows with 
fewer clusters, the association effect estimates 
are susceptible to outliers and should be inter- 
preted with caution. Notwithstanding this 
caveat, these putative assignments represent an 
extensive resource for future studies (table S10). 
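
For readers unfamiliar with the Benjamini-Hochberg correction applied to these TF-to-tissue tests, the standard procedure can be written in a few lines. The sketch below is the generic algorithm rather than code from this study; the function name is hypothetical and the significance threshold is left to the caller.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

Each adjusted value can then be compared against the chosen threshold to decide which TF-to-tissue relationships are retained.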

To demonstrate the potential of our approach 
to discover previously unknown putative roles 
for TFs, we selected four genes and validated 
whether they were expressed in the linked 
germ layer by fluorescent in situ hybridization. 
Although these genes were inferred to have
effects in multiple germ layers, their function
in either mesoderm (CG5953 and CG11617) or
neuroectodermal tissues (Ets65A and CG12605) 
was poorly characterized. We confirmed that 
these factors are in fact expressed in the tissue 
and time window predicted by our data (fig. 
S10), suggesting potential roles for these TFs
in mesoderm and neuronal development. 

To complement the NNLS, we applied a re- 
cently developed tool, FigR (37), to further 
facilitate gene regulatory network (GRN) re- 
construction. Because multi-omic ATAC-RNA 
data from the same cell are required for this 
task, we first integrated our two independent 
assays for all cells from 10 to 12 hours using 
canonical correlation analysis (CCA), identify- 
ing the most likely ATAC-RNA cell pairs using 
geodesic distance-based pairing (37) within 
the common CCA space. Using these pairs as 
input for GRN inference with FigR, we linked 
ATAC peaks to their target genes based on 
peak-to-TSS accessibility correlation and then 
computed TF motif enrichments for the linked 
regions, which, together with the TF expression- 
accessibility correlation, allowed us to define 
hundreds of putative activators and repressors 
at this embryonic stage (fig. S11A). Ranking
the TFs by their regulation score (fig. S11B) 
nominated many activators and repressors 
that we also identified in the NNLS analysis 
above, including l(3)neo38, Lim3, lola, fkh, and
Sru (Fig. 5D). Focusing on the targets of the 
regulatory networks across all cells at 10 to 
12 hours, we found a large set of genes that 
appear to be extensively regulated (209 genes 
with >10 linked regulatory regions) (fig. S11C). 
We then used the inferred TF activities to 
explore the factors acting on these genes and 
their mode of regulation. For example, tup, a 
TF gene required for heart development, under- 
goes extensive self-regulation (highest motif- 
RNA correlation) besides being positively 
regulated by the pan-muscle TF Mef2 and 
repressed by Run and Opa (fig. S11D). Another 
top-ranking gene, chinmo, an essential neuro- 
nal TF, is activated by other nervous system 
TFs, such as Lim1 and Onecut, and is negatively
regulated by Fru (fig. S11E), which we
also identified as a neuroectoderm-specific 
repressor in our NNLS-based analysis (Fig. 
5D and supplementary note 4). 

Finally, we sought to exploit the fine-grained 
resolution of inferred nuclear ages to explore 
the dynamics of an early pioneer TF, Zelda, 
in regulating chromatin opening followed by 
transcription during ZGA. We recovered the 
expression of a set of genes that are Zelda de- 
pendent during ZGA (38) and, for each gene, 
aggregated accessibility at the linked Zelda- 
bound regions (39) in intervals of 1 min across 
0 to 3 hours of embryogenesis (Fig. 5E). Clus- 
tering of gene expression identified two broad 
temporal clusters—a first group of early genes 
and a second group whose expression increases
later, after ~1.5 hours of embryogenesis. Not- 
ably, although accessibility at the Zelda-bound 
regions linked to the early cluster seems to 
mirror the temporal expression, regions linked 
to the late expression gene cluster gain acces- 
sibility much earlier, almost as early as the first 
cluster, which suggests that Zelda is opening 
these regions for future activation (Fig. 5F). To 
verify whether accessibility is reflective of Zelda 
binding, we retrieved Zelda occupancy by nu- 
clear cycle (39), which confirmed that >70% of 
regions in both temporal clusters are already 
occupied by Zelda at nuclear cycle 8 to 9, re- 
gardless of the associated gene expression (Fig. 
5G). Moreover, we found a partial Clamp TF 
motif match within the second temporal clus- 
ter (and no match for the first cluster of a TF 
that is also expressed), which corroborates its 
Zelda-paired role during ZGA (40). These re- 
sults suggest that Zelda establishes chromatin 
accessibility at a large set of regulatory regions 
in the early embryo, independently of future 
gene expression, in agreement with its well- 
known role as a pioneer factor. In some cases, 
Zelda possibly also functions as the activator 
of gene expression (cluster 1), whereas in others 
it retains a pioneering role, and the gene’s ex- 
pression is induced by later TFs (cluster 2). 
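
The 1-min aggregation used for the Zelda analysis amounts to averaging per-nucleus values within consecutive windows of inferred age. A minimal sketch, assuming ages are given in minutes and using a hypothetical function name:

```python
from collections import defaultdict
from math import floor


def average_by_minute(ages_min, values, start=0, end=180):
    """Average `values` in 1-min windows of inferred age within [start, end).

    ages_min[i] is the inferred age (in minutes) of nucleus i and values[i]
    is its expression or aggregated accessibility. Windows without any
    nuclei are omitted from the result.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for age, value in zip(ages_min, values):
        if start <= age < end:
            window = floor(age)
            sums[window] += value
            counts[window] += 1
    return {w: sums[w] / counts[w] for w in sorted(sums)}
```

Computing this once for expression of each Zelda-dependent gene and once for accessibility at its linked Zelda-bound regions gives paired time courses of the kind compared in the text.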


Discussion 


This continuum of Drosophila embryogenesis 
builds on our previous work generating sci- 
ATAC-seq from three nonoverlapping time 
windows of embryogenesis (12) and complements
other studies performed on specific
tissues (30, 41-46) as well as scRNA from entire
embryos at one specific stage (7) or on dis- 
sected tissues from adults (47). Despite the 
growing use of single-cell assays to generate 
large-scale atlases, characterizing fine-scale 
dynamics of chromatin accessibility and gene 
expression across developmental time remains 
a challenge. The large number of cell types and 
even greater number of cell states and branch 
points during embryogenesis requires exten- 
sive cell sampling at continuous stages to cap- 
ture regulatory transitions, especially for rare 
cell types. This is very difficult if not essentially 
impossible to obtain in most model organisms. 

In this work, sampling embryo collections 
from overlapping 2- to 4-hour time windows, 
coupled with NN-based inference of more pre- 
cise nuclear ages, enabled continuous repre- 
sentation of Drosophila embryonic development. 
Other studies have attempted a similar order- 
ing of embryos by developmental time over a 
2-day window of mouse development (4). How- 
ever, because only dozens rather than thou- 
sands of mouse embryos can practically be 
sampled, reliable inference at the scale of hours 
or minutes is challenging. Similarly, cell age 
was inferred in Caenorhabditis elegans using 
an independent time series of bulk RNA-seq 
from whole embryos (48). However, relying
on such whole-embryo bulk data to predict 
developmental age in single cells risks inaccu- 
rate aging of rare or transient cell types, es- 
pecially for more complex organisms. 

Computationally, our NN-based inference of 
developmental age bears some similarity to 
the concept of pseudotime. As originally pro- 
posed, pseudotime aims to serve as “a quantita- 
tive measure of progress through a biological 
process” (78). Analogously, our inferred develop- 
mental age tracks the progression of nuclei 
through development. However, the advantage 
of pairing an experimental design including 
overlapping yet tightly defined time windows 
with temporal ordering is that we can anchor 
inferred ages to fixed time points, which can 
potentially lead to a more accurate represen- 
tation of developmental age for complex cel- 
lular trajectories. Put another way, inferred 
ages are interpretable as units of absolute time 
that are synchronized across all tissue trajec- 
tories. With such a continuum of cellular states, 
we can begin to infer cell type trajectories that 
more closely capture the continuous processes 
of cellular differentiation unfolding within a 
complex, developing multicellular organism. 

There remain further possible improvements 
to our experimental framework. The alignment 
or anchoring to real time could be refined 
with sampling of more tightly staged win- 
dows. Multi-omic methods for characterizing 
multiple data types from the same nuclei may 
facilitate a joint model that can link paired 
gene expression and chromatin accessibility 
(and other modalities) to developmental age 
inference. There are cases where technical 
features of the data can lead to increased un- 
certainty of model predictions. For example, 
we found that cells annotated as germ cells, 
from the first collection time window, or with 
low read count were associated with greater 
prediction error (fig. S11F). Moving forward,
we suggest caution for interpreting findings 
solely on the basis of inferred nuclear ages 
from clusters with these features. 

The extensive scATAC data, with deep cover- 
age across almost a million cells, likely captured 
most regulatory elements active during em- 
bryonic development and provides a compre- 
hensive resource of potential enhancers for 
almost any cell type in the embryo. By con- 
trast, our scRNA data had relatively low unique 
reads per cell and will likely miss some dif- 
ferentially expressed genes in specific cell 
types. As a result, some delicate analyses re- 
main challenging. For example, we found tran- 
scriptional velocity estimates to be unstable 
with sparse scRNA data, although this issue 
was mitigated by constructing metacells before 
velocity analysis (fig. S11G), which may be use- 
ful for pursuing targeted questions. In scATAC 
data, we were able to distinguish XX versus 
XY nuclei from the proportion of chrX-mapped 
reads (fig. S11H); however, this was challenging
for the scRNA data, again as a result of data 
sparsity. These shortcomings are to some degree 
compensated by the large number of cells 
profiled, as shown by our ability to recapit- 
ulate aspects of previously documented hetero- 
geneity even for highly dynamic or restricted 
phenomena—e.g., ZGA (Fig. 2E). 

Overall, this Drosophila embryonic atlas pro- 
vides broad insights into the orchestration of 
cellular states during the most dynamic stages 
in the life cycle of the organism. Our results 
represent a rich resource for understanding 
precise time points at which genes become 
active in distinct tissues as well as how chro- 
matin is remodeled across time. The anno- 
tation of cell types within these data is an 
ongoing process and one that is much more 
challenging at early and mid-stages of embryo- 
genesis as compared with late time points or 
in adults with differentiated tissues. A com- 
prehensive annotation of embryonic cell states 
will require a collective effort from the Drosophila 
community. To support these ongoing efforts, 
we provide information on expression and 
peaks from all clusters (Fig. 3, A to D) in ad- 
dition to all intermediate and raw data for 
further exploration. Although larval stages re- 
main insufficiently profiled, we hope that these 
data and methods, together with the recently 
released large-scale adult atlas (47), bring us 
closer to the community-wide goal of a multi- 
modal Drosophila atlas spanning a continuum 
from zygote to adulthood. 


Materials and methods summary 


A detailed version of the materials and meth- 
ods is provided in the supplementary mate- 
rials. In brief, D. melanogaster embryos were 
acquired for each of 11 collection windows, 
and then each pool of embryos was divided, 
with each half being extracted and fixed for 
either sci-RNA-seq3 or sci-ATAC-seq3. Librar- 
ies were sequenced deeply, and the resulting 
reads were mapped to dm6 and then pro- 
cessed with a uniform processing pipeline 
that included quality control (QC) filters for 
low read depth or high proportions of reads 
mapping to the mitochondria or ribosomal 
genes and extensive doublet removal. Between 
the two data modalities, we obtained profiles 
for ~1.5 million nuclei, although unique read 
depth per nucleus was considerably lower for 
scRNA than scATAC data. 

Using the center hour of the collection win- 
dow, we used several machine learning ap- 
proaches to fit a model that could infer the age 
of a nucleus with either gene expression or 
chromatin accessibility information. Both LL 
regression and neural networks were fitted to 
the same training data, with a held-out subset 
used for model validation and comparison. 
Given its consistently superior performance, 
we then relied on specific parameterizations of 
NN model-inferred ages to reposition nuclei
in time. To zoom into fine-scale time points, 
we binned data by small increments to explore 
the regulatory dynamics of ZGA. Then, using 
2-hour adjacent windows of cells, we com- 
puted clusters of similar cells and performed 
extensive manual review to annotate each 
cluster’s likely germ layer and cell type. We 
then used an iterative approach for construct- 
ing an acyclic tree of differentiation by iden- 
tifying the likely precursor cluster for each 
cluster in a given time window. 
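
As a minimal sketch of the age-inference setup described at the start of this paragraph (not the authors' pipeline), the following Python example labels each simulated nucleus with the center hour of its collection window, fits a plain linear regression by gradient descent, and evaluates it on a held-out subset. The window boundaries, the two synthetic features, the noise level, and all function names are hypothetical; the study itself compared regression and neural networks on real expression or accessibility profiles.

```python
import random

def center_hour(window):
    """Training label: midpoint of a (start_hour, end_hour) collection window."""
    start, end = window
    return (start + end) / 2.0

def fit_linear(X, y, lr=0.2, epochs=300):
    """Fit y ~ w.x + b by batch gradient descent on mean squared error."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(model, x):
    """Inferred age in hours for one feature vector."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b

# Hypothetical overlapping 4-hour windows covering 0 to 20 hours.
windows = [(h, h + 4) for h in range(0, 17, 2)]

rng = random.Random(0)
nuclei = []
for window in windows:
    for _ in range(60):
        true_age = rng.uniform(*window)            # unknown in a real experiment
        scaled = (true_age - 10.0) / 10.0
        features = [scaled + rng.gauss(0, 0.05),   # synthetic signal rising with age
                    -scaled + rng.gauss(0, 0.05)]  # synthetic signal falling with age
        nuclei.append((features, center_hour(window), true_age))

# Hold out 20% of nuclei for validation.
rng.shuffle(nuclei)
split = int(0.8 * len(nuclei))
train, held_out = nuclei[:split], nuclei[split:]

model = fit_linear([f for f, _, _ in train], [lab for _, lab, _ in train])

# Held-out error against the window label (observable) and against the
# simulated true age (available here only because the data are synthetic).
mae_label = sum(abs(predict(model, f) - lab) for f, lab, _ in held_out) / len(held_out)
mae_true = sum(abs(predict(model, f) - age) for f, _, age in held_out) / len(held_out)
print(f"held-out MAE vs window center: {mae_label:.2f} h")
print(f"held-out MAE vs simulated age: {mae_true:.2f} h")
```

The point of the toy example is that a model trained only on coarse window-center labels can still place nuclei more finely than the window width, because overlapping windows constrain where each feature pattern falls in time.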

Neuroectoderm was iteratively analyzed for 
deeper annotation of neuronal subtypes, 
whereas mesoderm was picked for analyses 
focused on identifying coregulated genes and 
accessible regions, which were then subjected 
to ontology and TF motif enrichment analysis. 
To connect scATAC cell clusters with scRNA 
cell clusters, we used a regression-based ap- 
proach (NNLS). Such connections between 
ATAC and RNA clusters enabled a series of 
analyses, such as correlating expression with 
motif accessibility, applying GRN analysis 
pipelines, etc. 
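
To illustrate the regression-based linking of clusters, here is a minimal non-negative least squares (NNLS) sketch. It assumes, purely for illustration, that each scATAC cluster is summarized by a gene-level accessibility profile and each scRNA cluster by a mean expression profile over the same genes; the cluster names, the numbers, and the hand-written projected-gradient solver are hypothetical and are not taken from the study. In practice a library solver such as `scipy.optimize.nnls` would normally replace the loop.

```python
def nnls_projected_gradient(columns, target, n_iter=2000):
    """Minimize ||A x - target||^2 subject to x >= 0; `columns` holds the columns of A."""
    k, m = len(columns), len(target)
    # Step size 1/L, where L (sum of squared entries) is an upper bound on the
    # largest eigenvalue of A^T A, which keeps the iteration stable.
    lipschitz = sum(v * v for col in columns for v in col)
    x = [0.0] * k
    for _ in range(n_iter):
        fitted = [sum(x[j] * columns[j][i] for j in range(k)) for i in range(m)]
        residual = [fitted[i] - target[i] for i in range(m)]
        for j in range(k):
            grad_j = sum(columns[j][i] * residual[i] for i in range(m))
            x[j] = max(0.0, x[j] - grad_j / lipschitz)  # project onto x >= 0
    return x

# Hypothetical mean expression of four genes in three scRNA clusters.
rna_clusters = {
    "muscle": [5.0, 1.0, 0.0, 0.0],
    "neuron": [0.0, 4.0, 1.0, 0.0],
    "gut":    [0.0, 0.0, 1.0, 6.0],
}
# Hypothetical gene-level accessibility of one scATAC cluster,
# constructed as 0.8 * muscle + 0.2 * neuron.
atac_profile = [4.0, 1.6, 0.2, 0.0]

names = list(rna_clusters)
weights = nnls_projected_gradient([rna_clusters[n] for n in names], atac_profile)
best_match = max(zip(weights, names))[1]
print(dict(zip(names, (round(w, 3) for w in weights))), "->", best_match)
```

Because the coefficients are constrained to be non-negative, they can be read as mixing proportions, and the largest coefficient nominates the scRNA cluster that best explains the scATAC cluster.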

Several additional analyses were performed. 
We used probabilistic label transfer to map 
likely cluster annotations from these data to 
spatial information from patterned DNA nano- 
balls. We also found it is possible to infer 
the sex of cells from the proportion of chrX- 
mapped scATAC reads using a Gaussian mix- 
ture model to classify cells. Although RNA 
velocity was challenging to apply to sparse 
scRNA data, it yielded more sensible results 
when subsets of cells were first aggregated to 
metacells. 
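
The sex-inference step can be sketched with a two-component, one-dimensional Gaussian mixture fitted by expectation-maximization. The chrX read fractions below are simulated, and the component means, spreads, and the assumption that XX nuclei carry the larger chrX fraction are illustrative choices rather than values from the study.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def fit_two_gaussians(values, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture by expectation-maximization."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    mu = [min(values), max(values)]   # start the components at the extremes
    sigma = [sd, sd]
    weight = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            p = [weight[k] * normal_pdf(v, mu[k], sigma[k]) for k in range(2)]
            total = p[0] + p[1]
            resp.append([p[0] / total, p[1] / total])
        # M-step: update weights, means, and standard deviations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - mu[k]) ** 2 for r, v in zip(resp, values)) / nk
            sigma[k] = max(math.sqrt(var), 1e-6)   # floor avoids a collapsed component
            weight[k] = nk / len(values)
    return mu, sigma, weight

def posterior_high(x, mu, sigma, weight):
    """Posterior probability that x belongs to the component with the larger mean."""
    hi = 0 if mu[0] > mu[1] else 1
    p = [weight[k] * normal_pdf(x, mu[k], sigma[k]) for k in range(2)]
    return p[hi] / (p[0] + p[1])

# Simulated chrX read fractions (hypothetical values): 200 XX and 200 XY nuclei.
rng = random.Random(1)
truth = ["XX"] * 200 + ["XY"] * 200
chrx_fraction = ([rng.gauss(0.20, 0.01) for _ in range(200)]
                 + [rng.gauss(0.12, 0.01) for _ in range(200)])

mu, sigma, weight = fit_two_gaussians(chrx_fraction)
calls = ["XX" if posterior_high(x, mu, sigma, weight) > 0.5 else "XY"
         for x in chrx_fraction]
accuracy = sum(c == t for c, t in zip(calls, truth)) / len(truth)
print(f"component means: {sorted(mu)}, accuracy on simulated labels: {accuracy:.3f}")
```

With sparse data the two components overlap more, which is consistent with the difficulty reported for the scRNA reads; nuclei with posteriors near 0.5 could be left unassigned rather than forced into a class.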

The expression of several genes was veri-
fied by fluorescent in situ hybridization: specif-
ic neuronal genes active in identified clusters,
unexpected coactivity of elav with binou,
genes active at specific mesoderm time points, 
and putative active TFs with less-characterized 
roles in tissue development. 

Raw data are available through the Gene 
Expression Omnibus (GEO). Additional scripts 
and intermediate files, including bigwigs and 
a custom web app to visualize UMAPs, are 
available through our data-sharing website. 
Mitochondrial remodeling and ischemic protection by
G protein-coupled receptor 35 agonists 


Gregory A. Wyant, Wenyu Yu, Ilias P. Doulamis, Rio S. Nomoto, Mossab Y. Saeed,
Kynurenic acid (KynA) is tissue protective in cardiac, cerebral, renal, and retinal ischemia models, but
the mechanism is unknown. KynA can bind to multiple receptors, including the aryl hydrocarbon
receptor, the α7 nicotinic acetylcholine receptor (α7nAChR), multiple ionotropic glutamate receptors, and
the orphan G protein-coupled receptor GPR35. Here, we show that GPR35 activation was necessary
and sufficient for ischemic protection by KynA. When bound by KynA, GPR35 activated Gi- and G12/13-
coupled signaling and trafficked to the outer mitochondrial membrane, where it bound, apparently
indirectly, to ATP synthase inhibitory factor subunit 1 (ATPIF1). Activated GPR35, in an ATPIF1-dependent 
and pertussis toxin-sensitive manner, induced ATP synthase dimerization, which prevented ATP loss 
upon ischemia. These findings provide a rationale for the development of specific GPR35 agonists for the
treatment of ischemic diseases.


Kynurenic acid (KynA) has been shown
to protect tissues in preclinical organ
ischemia models and to mediate cardio-
protection in a mouse model of remote
ischemic preconditioning, a phenomenon
whereby an ischemic tissue confers ischemic
protection to other tissues at a distance (1-5).
Mice, however, are a suboptimal model for 
cardiac physiology studies because of their 
small size, extremely rapid heart rates, and 
short life spans (6). Therefore, we tested KynA 
in rabbits. Pretreatment of rabbits with KynA 
decreased infarct size when their excised, beat- 
ing, and perfused hearts (Langendorff prep) 
were subjected to a brief period of ischemia 
followed by reperfusion to model ischemia/ 
reperfusion (I/R) injury (fig. S1, A to D). KynA
also increased cardiac function after I/R, as 
measured by increased fractional shortening, 
increased left ventricular contractility (dP/dt 
max), and left ventricular peak developed 
pressure (LVPDP) during reperfusion (fig. S1, 
E to H). 

We tested which of the putative KynA re- 
ceptors (7-13), if any, mediated KynA’s tissue- 
protective effects. Because KynA protects 
isolated hearts against ischemia, we focused 
on the aryl hydrocarbon receptor (AhR) and 
G protein-coupled receptor 35 (GPR35) rather 
than on ionotropic glutamate receptors, which 
are linked to neurotransmission. KynA pro- 
moted the binding of the AhR to its partner, 
aryl hydrocarbon receptor nuclear transloca-
tor (ARNT), and activated AhR-responsive
transcription (fig. S2, A and B). However, car-
diac AhR loss in mice did not block KynA-
induced cardioprotection in the setting of 
I/R injury in vivo (fig. S2, C and D). Moreover, 
stable expression of a constitutively active AHR 
(AHR-CA) in cardiomyocytes derived from human
induced pluripotent stem (hIPS) cells did not 
promote their survival when they were tran- 
siently deprived of oxygen and nutrients and 
then re-exposed to oxygen and nutrients to sim- 
ulate I/R injury ex vivo (fig. S2, E to G). There- 
fore, AhR activation is neither necessary nor 
sufficient for KynA-induced cardioprotection. 

GPR35 is broadly expressed, and its abun- 
dance is increased after hypoxia under the 
control of the HIFα transcription factor and
also in human heart failure (fig. S3, A to C)
(14, 15). Moreover, GPR35 is linked to tissue 
protection, although studies disagree as to 
whether GPR35 agonism or antagonism is 
protective (16-18). Whether KynA is an au- 
thentic endogenous GPR35 ligand is also con- 
troversial, partly because its binding affinity 
for GPR35 varies across species and, in some 
species, is relatively low (in the micromolar 
range) (19). 

We confirmed that radiolabeled KynA bound 
to purified human and mouse GPR35 (Fig. 1A). 
This was specific, because radiolabeled KynA 
did not bind to a control G protein-coupled 
receptor (GPCR), GPR160, and its binding to 
GPR35 was prevented by excess unlabeled KynA 
and the known GPR35 activators pamoic acid, 
zaprinast, and lodoxamide, but not by the KynA- 
derived metabolite quinolinic acid or trypto- 
phan (the parent molecule from which KynA 
is derived) (Fig. 1B) (20-22). KynA activated 
Gi and G12/13 in a GPR35-dependent manner,
as determined by Gi and G12/13 recruitment,
activation of the protein kinase Rho, and 
suppression of forskolin-induced cAMP abun- 
dance and downstream signaling by protein 
kinase A (PKA) (fig. S4, A to E). 

We sought to create a GPR35 mutant that 
could not bind to KynA so that it could be used 
as a specificity control. We made a series of 
GPR35 missense mutants in which specific 
residues within GPR35 transmembrane do- 
mains III and IV were converted to alanine, 
motivated by previous homology modeling 
and ligand-docking studies that implicated 
this region (particularly specific arginine res- 
idues within this region) in ligand binding 
(23, 24). One such variant, GPR35 R151A, failed 
to bind to radiolabeled KynA in assays using 
GPR35-expressing cell membrane fractions or 
highly purified GPR35 (Fig. 1, C to F). Similarly, 
KynA bound to wild-type GPR35, but not GPR35 
R151A, in thermal shift assays based on thermo- 
stabilization upon ligand binding (Fig. 1G). 
Because these studies used GPR35 that was 
purified from mammalian cells (293T cells), 
we cannot yet formally exclude that additional 
cellular proteins are required for the binding 
of KynA to GPR35. Many GPCRs, including 
GPR35, are internalized upon activation (25). 
We confirmed that wild-type GPR35, but not 
GPR35 R151A, was internalized in early endo- 
some antigen 1 (EEA1)-containing endosomes 
in cells treated with KynA or the GPR35 agonists 
pamoic acid or zaprinast, but not in cells treated 
with quinolinic acid (Fig. 1H and fig. S5A) 
(11, 15). GPR35-containing endosomes also 
contained transferrin receptor (TfR), which 
is normally present on plasma membranes, 
consistent with the GPR35 endosomes par- 
ticipating in endosomal trafficking (fig. S5B). 
β-arrestin localization was not regulated by
KynA or by the GPR35 agonist pamoic acid in 
the cardiomyocytes that we studied, although 
it is regulated by many other GPCRs that are 
internalized upon activation (fig. S5C) (25). 

Cardioprotection by KynA administered either 
2 or 24 hours before injury in vivo or 10 or 
30 min before I/R in ex vivo hearts was com-
pletely abrogated in GPR35-/- mice that we
made using CRISPR/Cas9-based gene editing 
of mouse embryos (Fig. 2, A and B, and fig. S6A) 
(26). GPR35 loss did not alter, either positively or 
negatively, the induction of hypoxia-inducible 
transcription factor 1α (HIF1α) after cardiac I/R
injury (fig. S6B). In cell culture experiments per- 
formed with murine neonatal cardiomyocytes, 
KynA, in a GPR35-dependent manner, de- 
creased resting oxygen consumption and, after 
simulated I/R injury, decreased mitochondrial 
reactive oxygen species production and pre- 
served mitochondrial membrane potential (fig. 
S6, C to E), changes that would be predicted to 
promote survival after I/R. 

Fig. 1. KynA binds GPR35 and promotes GPR35 internalization. (A, B, and
F) [3H]-KynA-binding assay using tandem affinity-purified versions of the
indicated high-density lipoprotein-reconstituted hemagglutinin (HA)-FLAG-
tagged GPCRs in the presence or absence of the indicated unlabeled competitor
compounds (10 mM). (C and D) [3H]-KynA binding (C) and immunoblot
(D) assays of isolated membrane fractions from 293T cells expressing FLAG-
GPR160 or FLAG-GPR35 (wild-type or mutant). Where indicated, 1 and
10 mM unlabeled KynA was used as a competitor. (E and G) Coomassie

The drug FG-4592 (roxadustat), which sta-
bilizes HIF1α by inactivating the egg laying
defective 9 (EGLN) prolyl hydroxylases, was
shown to be cardioprotective in preclinical 
models (27-29). Cardioprotection is likely an 
on-target effect of FG-4592 because similar 
effects have been observed with other struc- 
turally unrelated EglN inhibitors and after 
systemic or cardiac-specific genetic inactivation 
of EglN1, which is the primary regulator of
HIF1α stability (7, 28, 30-32). Although EglN
inactivation is thought to cardioprotect by
stabilizing HIF1α, systemic inactivation of EglN
also increases the plasma concentrations of 
2-oxoglutarate (2-OG), which is converted to 
KynA by the liver and released into the cir- 
culation (1). Cardiac protection by FG-4592 in 
vivo was partially abrogated by loss of GPR35 
(Fig. 2A). By contrast, FG-4592 was cardio-
protective in wild-type and GPR35-/- hearts
when administered ex vivo, a setting in which 
KynA would not be produced, before I/R 
(Fig. 2, B and C). Protection by FG-4592 was 
observed when given 30 min, but not 10 min, 
before I/R, which correlated with successful 
induction of HIF1α 30 min after FG-4592 ad-
ministration (Fig. 2C). By contrast, KynA pro- 
tected more rapidly (Fig. 2B). These results 
suggest that cardiac protection in vivo by 
roxadustat is caused by the direct effects of 
HIFα and the indirect effects on GPR35 me-
diated by endogenous KynA. 

Pertussis toxin, a classic Gi and Go protein
family inhibitor, abolished cardiac ischemic
preconditioning in rats (33) and likewise in-
hibited KynA-induced cardiac ischemic pro-
tection in vivo, suggesting that intact Gi or Go
family signaling by GPR35 is required for 
KynA-induced ischemic protection (Fig. 2D). 
Multiple structurally unrelated GPR35 agonists, 
including lodoxamide, zaprinast, and pamoic
acid, were cardioprotective in vivo and ex vivo, 
whereas quinolinic acid, which fails to activate 
GPR35, was not (Fig. 2, E and F). Where tested, 
cardioprotection by these other GPR35 agonists 
was GPR35 dependent (Fig. 2A). 

To formally test whether KynA binding to 
GPR35 is necessary for KynA-induced ischemic 
protection, we used CRISPR-Cas9 to generate 
hIPS cells lacking GPR35 and then infected 
them with a lentivirus encoding wild-type 
GPR35, GPR35 R151A, or the empty vector 
(Fig. 2G). These cells were then induced to 
become cardiomyocytes and subjected to sim- 
ulated I/R ex vivo. KynA pretreatment pro- 
tected against I/R-induced injury, and this 
protection was lost in the absence of GPR35.
Reexpression of the wild-type GPR35, but not
GPR35 R151A, in the GPR35-/- cardiomyo-
cytes restored protection by KynA (Fig. 2H). 
Thus, GPR35 mediates KynA-induced ische- 
mic protection in both mouse and human 
cells, and this protection requires the bind- 
ing of KynA to GPR35. 

GPR35 is annotated to bind to the mito- 
chondrial protein ATP synthase inhibitory fac- 
tor subunit 1 (ATPIF1) and ATP synthase in 
the BioPlex protein-protein interaction database 
(34). To test whether GPR35 can associate with 
mitochondria, we reintroduced wild-type or 
R151A GPR35 into GPR35-/- murine neonatal
cardiomyocytes by lentiviral infection. Treat- 
ment of the wild-type cells with GPR35 agonists, 
including KynA, led to colocalization of GPR35 
with mitochondrial proteins such as citrate 
synthase and the ATP synthase component 
ATP5H, but not the Golgi apparatus protein 
GM130 or the endoplasmic reticulum protein 
calreticulin, as shown by confocal microscopy 
and quantitative image analysis (Fig. 3A and 
figs. S7, A to C, and S8, A to D). This was 
specific because colocalization with mitochon- 
dria was not observed with GPR35 R151A or with 
quinolinic acid. KynA and the GPR35 agonist 
pamoic acid, but not quinolinic acid, also in- 
creased GPR35 and TOM20 association in situ 
based on proximity ligation assays (fig. S7, D to 
E). To visualize GPR35 trafficking to mitochon- 
dria, we transiently overexpressed GPR35-GFP 
together with mCherry-TOMM20 in GPR35-/-
murine cardiomyocytes and performed live-cell 
imaging. KynA treatment caused GPR35-GFP to 
become internalized within minutes on cyto- 
plasmic punctae that associated with mCherry- 
TOMM20 (fig. S8E). To corroborate these 
findings, we rapidly isolated mitochondria 
from cells expressing 3XHA-EGFP-OMP25 
(HA-MITO) after KynA treatment and then 
performed immunoblot assays (35). We repro- 
ducibly detected the association of GPR35 
with mitochondria, and this was increased in 
cells treated with KynA or other GPR35 agonists, 
but not in cells treated with quinolinic acid (Fig. 
3B and fig. S9A). We introduced a biotin ligase 
(APEX) fused to wild-type or R151A GPR35 
into the GPR35-/- murine cardiomyocytes and
captured GPR35-associated proteins using 
streptavidin agarose after treatment with 
biotin tyramide and hydrogen peroxide (to
activate the APEX enzyme) in the presence
or absence of GPR35 agonists (fig. S9B) (36).
Multiple mitochondrial outer membrane pro-
teins, such as VDAC, TOM20, and TOM70, were
associated with wild-type, but not R151A,
GPR35 in cells treated with KynA (Fig. 3C).
Similarly, KynA promoted biotin labeling of
mitochondria, as determined by confocal mi-
croscopy using streptavidin-568 as a probe, in
cells expressing wild-type GPR35-APEX, but
not in those expressing GPR35 R151A-APEX
(fig. S9C) (37).

blue staining (E) and immunoblot (G) assay of the purified GPCRs used in (F).
In (G), the proteins were preincubated with 1 mM KynA or dimethyl sulfoxide

exposure to the indicated temperatures. (H) Confocal

natal cardiomyocytes stably expressing wild-type or

lated with 20 μM quinolinic acid, 20 μM KynA, or
1 μM pamoic acid for 20 min. Scale bar, 20 μm. In (A), (B), (C), and (F),
values are shown as means ± SDs for three technical replicates from one
representative experiment.

We confirmed, using two different epitope
tags, that epitope-tagged human and mouse
GPR35 coimmunoprecipitated with endoge-
nous ATPIF1 and ATP synthase, as determined
by immunoblot analysis (fig. S10, A to D).
Consistent with our localization studies, the
binding of GPR35 to ATPIF1 was enhanced
by KynA and the GPR35 agonists zaprinast
and pamoic acid, but not by quinolinic acid
(Fig. 4A and fig. S10A). As additional specificity
controls, we looked for binding of ATPIF1 to
cytosolic METAP2, the transmembrane mito-
chondrial protein TMEM141, GPR160, or a
GPR35 C-terminal truncation mutant (Fig. 4B
and fig. S10B), but found none. The electro-
phoretic mobility of wild-type GPR35 was de-
creased under these gel conditions compared
with the C-terminal truncation mutant because
of a slightly higher molecular size and because
of glycosylation events that require the GPR35
C terminus (38).

(E) Myocardial infarct size after in vivo cardiac I/R injury in wild-type mice given GPR35 agonists. Where indicated, mice were given 5 mg/kg KynA, 5 mg/kg
zaprinast, 5 mg/kg quinolinic acid, or vehicle by intraperitoneal injection 2 hours before the onset of ischemia. n = 4 per group. (F) LVEDP in
Langendorff assays after global I/R injury. Where indicated, hearts were infused with 500 nM KynA, 1 μM lodoxamide, 1 μM zaprinast, 500 nM quinolinic acid, or
vehicle for 10 min before the onset of ischemia. n = 4 per group. (G) Immunoblot of wild-type or GPR35-/- hIPS-derived cardiomyocytes stably expressing wild-type
or R151A GPR35. (H) Fractional survival after simulated I/R injury ex vivo of hIPS cell-derived cardiomyocytes modified as in (G). Cells were pretreated with
20 μM KynA, 20 μM quinolinic acid, or DMSO for 1 hour before I/R. Data are shown as mean ± SD. *P < 0.05.

ATPIF1 binds to, and regulates, the multi-
meric ATP synthase complex (39). The ATP
synthase subunits ATP5B, ATP5H, ATP5O,
and ATP5F1 also coimmunoprecipitated with
GPR35, and their abundance mirrored that of
ATPIF1 (Fig. 4, A and B, and fig. S10, A to C).
Given that ATPIF1 and ATP synthase are lo-
calized inside mitochondria, we tested whether
GPR35 is present inside or outside mitochon-
dria after treatment with KynA. We treated
murine cardiomyocytes expressing wild-type
GPR35 with KynA and used differential cen-
trifugation to purify mitochondria, which were
then exposed to increasing amounts of pro-
teinase K. Mitochondrially associated GPR35
was proteinase K sensitive, arguing that it is
largely associated with the outer mitochondrial
membrane (Fig. 4C). Consistent with this,
KynA promoted GPR35-APEX biotinylation
of the outer mitochondrial membrane pro-
teins VDAC and TOM70, but did not promote
biotinylation of ATPIF1 or ATP synthase
(Fig. 4D). Therefore, KynA promotes the asso-
ciation of GPR35 with ATPIF1 and ATP syn-
thase, but this interaction is likely indirect.
During ischemia, ATP synthase shifts from
ATP synthesis to ATP hydrolysis, with loss of
ATP eventually contributing to cell death (40).
Binding of ATPIF1 to ATP synthase promotes
the formation of ATP synthase dimers and
thereby inhibits ATP hydrolysis, oxidative
phosphorylation, and oxygen consumption
(41-44). ATPIF1 mimetic compounds and
transgenic overexpression of ATPIF1 both
protect tissues in I/R models (44, 45). Here,
KynA increased the recovery of ATP synthase
after immunoprecipitation of ATPIF1 and de-
creased the recovery of ATPIF1 after the
immunoprecipitation of ATP synthase in a
pertussis toxin-sensitive manner, consistent
with the ATPIF1:ATP synthase stoichiometry
shifting from 1:1 to 1:2 (Fig. 5, A and B, and
fig. S11, A to D). We independently confirmed
that KynA promotes ATP synthase dimerization
using native gel electrophoresis and immuno-
fluorescence and electron microscopy, which
revealed hallmark changes in mitochondrial
morphology and mitochondrial cristae, respec-
tively, usually associated with ATP synthase
dimerization (Fig. 5, C to E, and figs. S11E and
S12, A to C) (46).

Fig. 3. KynA promotes GPR35 association with mitochondria. (A) Confocal
microscopy of GPR35-/- mouse neonatal cardiomyocytes stably expressing FLAG-
tagged wild-type or R151A GPR35 that were treated with 20 μM quinolinic acid,
20 μM KynA, 1 μM pamoic acid, or DMSO for 20 min. Scale bar, 20 μm.
(B) Immunoblot analysis of whole-cell extract (WCE) or mitochondria that were
rapidly immunopurified by anti-HA immunoprecipitation (HA-IP) from 293T cells
engineered to contain HA-tagged mitochondria (HA-MITO) or, as a control, FLAG-
tagged mitochondria (FLAG-MITO). Where indicated, the cells were pretreated with
20 μM KynA or 10 μM zaprinast for 20 min before analysis. (C) GPR35-/- mouse
cardiomyocytes stably expressing FLAG-APEX-tagged wild-type or R151A GPR35
were treated with biotin-tyramide for 30 min and, where indicated, with 20 μM KynA
for 20 min and H2O2 for 1 min (to enable biotinylation). Cell lysates (WCE) and
biotinylated proteins captured on streptavidin agarose (Strep-PD) were resolved by
SDS-polyacrylamide gel electrophoresis (PAGE), transferred to nitrocellulose, and
immunoblotted with antibodies against the indicated proteins or probed with
horseradish peroxidase-conjugated streptavidin (Strep-HRP).

Fig. 4. KynA promotes the GPR35-ATPIF1-ATP synthase interaction.
(A) Immunoblot analysis of anti-FLAG immunoprecipitates (FLAG-IP) and WCE
from mouse neonatal GPR35-/- cardiomyocytes stably expressing GPR160-FLAG
or GPR35-FLAG that were treated with 20 μM quinolinic acid, 20 μM KynA,
1 μM pamoic acid, 10 μM zaprinast, or DMSO for 20 min. (B) Immunoblot
analysis of anti-FLAG immunoprecipitates from human AC16 cardiomyocytes
stably expressing TMEM141-FLAG, GPR35 (wild-type)-FLAG, or GPR35
(ΔC terminus)-FLAG. (C) Immunoblot analysis of mitochondria isolated from
mouse neonatal cardiomyocytes stably expressing GPR35-FLAG that were
treated with 20 μM KynA for 20 min. The isolated mitochondria were incubated
with increasing concentrations of proteinase K (PK) for 30 min. (D) GPR35-/-
mouse neonatal cardiomyocytes stably expressing FLAG-APEX-tagged wild-type
or R151A GPR35 were treated with biotin-tyramide for 30 min and, where
indicated, with 20 μM KynA for 20 min and H2O2 for 1 min (to enable
biotinylation). Cell lysates (WCE) and biotinylated proteins captured on
streptavidin agarose (Strep-PD) were resolved by SDS-PAGE, transferred to
nitrocellulose, and immunoblotted with antibodies against the indicated proteins
or probed with Strep-HRP.

Fig. 5. KynA promotes ATP synthase dimerization and ATPIF1 is required
for KynA-induced ischemic protection. (A) Immunoblot analysis of HA-IP and
WCE from mouse neonatal cardiomyocytes stably expressing Rab5B-HA or
ATP5H-HA that were or were not treated with 20 μM KynA for 20 min.
(B) Immunoblot analysis of FLAG-IP and WCE from mouse neonatal
cardiomyocytes stably expressing Rell2-FLAG or FLAG-ATPIF1 that were or
were not treated with 20 μM KynA for 20 min. (C) Electron micrographs of
wild-type and GPR35-/- cardiomyocytes stably expressing GPR35-FLAG,
R151A GPR35-FLAG, or empty vector (EV) that were treated with 20 μM KynA
or DMSO for 20 min. Arrowheads indicate mitochondria. Scale bar, 500 nm.
(D) Live-cell fluorescence imaging of mouse neonatal cardiomyocytes treated

Because phosphorylation of ATPIF1 by PKA
prevents it from inhibiting ATP synthase (47),
we tested whether GPR35 activation caused
dephosphorylation of ATPIF1. Both KynA and
the GPR35 agonist pamoic acid blocked ATPIF1
phosphorylation in a GPR35-dependent man-
ner, as shown by characteristic electrophoretic
mobility changes after two-dimensional gel
electrophoresis (Fig. 5F and fig. S12D).

We used CRISPR/Cas9 to generate ATPIF1-/-
cardiomyocytes from hIPS cells and then in-
fected them with wild-type ATPIF1, an ATPIF1
point mutant that cannot bind to ATP synthase
(ATPIF1 E55A), or an empty vector (fig. S13, A
and B) (48). Wild-type ATPIF1, but not ATPIF1
E55A, allowed KynA to prevent ATP consump-
tion and preserve cell viability during simulated
ischemia ex vivo (Fig. 5, G and H). Collectively,
these results show that KynA ischemic protec-
tion depends on ATPIF1.

Although GPCRs are typically thought to
function at the plasma membrane, there are
multiple examples of mitochondrial GPCRs
and G proteins (49, 50). For example, the G
proteins guanine nucleotide binding protein-β
subunit 2 (Gβ2) and guanine nucleotide bind-
ing protein subunit alpha-12 (Gα12) have roles
in mitochondrial fission versus fusion (51, 52).
Moreover, many GPCRs, including GPR35, are
internalized once activated, presumably so
that they can signal in a spatially distinct
manner (53). The endoplasmic reticulum has
important roles in endosomal trafficking and
forms multiple contacts with mitochondria,
and a physical interaction between endosomes
and mitochondria serves critical functions
in cell homeostasis (54-56). The endosomal
trafficking of GPR35 to mitochondria is remi-
niscent of Rab5, a small GTPase that trans-

in response to oxidative stress to confer cyto-
protection (57).

We found that GPR35 could bind to ATPIF1 
and that the latter is necessary for cardioprotection 
by the former. Nonetheless, GPR35 associates 
with the mitochondrial outer membrane 
in response to KynA, whereas ATPIF1 is a 
mitochondrial matrix protein (41). Moreover, 
we did not detect biotinylation of ATPIF1 or 
ATP synthase by GPR35-APEX, and neither 
ATPIF1 nor ATP synthase was shown to be 
proteinase K sensitive. Thus, the interaction of 
GPR35 with ATPIF1 is not direct, but instead it 
is likely bridged by one or more proteins, per- 
haps at contact sites that have been demon- 
strated between outer and inner mitochondrial 
membranes and endosome-mitochondrial mem- 
brane contact sites (55, 58). Pertussis toxin did 
not block the internalization of GPR35 by KynA 
but blocked the KynA-mediated regulated in- 
teraction between ATPIF1 and ATP synthase, 
ATPIF1 dephosphorylation, and cardioprotec- 
tion; the latter two therefore appear to be linked 
and to require Gi/o signaling (figs. S14A and 
S15A and Fig. 2D). We propose that GPR35, 
once activated and translocated to mitochon- 
dria, inhibits mitochondrial adenylate cyclase 
and thereby PKA. This, in turn, would allow 
ATPIF1 to promote ATP synthase dimerization 
and prevent ATP hydrolysis. Although KynA 
might have direct effects on mitochondria and 
may regulate metabolism, the effects that we 
observed clearly depend on GPR35 (59-65). 

It is clear that ischemic preconditioning in- 
volves both an early and a late phase of 
ischemic protection, and we have shown that 
GPR35 is necessary for KynA-mediated cardio- 
protection whether administered acutely or 
24 hours before I/R injury (Fig. 2A) (66). Al- 
though our data show that GPR35 is necessary 
in both settings, it remains unknown whether 
ATPIF1 is required at longer time scales. 

Ischemic diseases such as myocardial infarction 
and stroke are major causes of death in 
the developed world. Remote ischemic preconditioning 
has been demonstrated in animal 
models, but attempts to harness it for therapeutic 
purposes in humans, such as through 
repeated hyperinflation of a limb blood pressure 
cuff before elective heart surgery, have 
not proven to be successful (67). Without an 
understanding of the underlying mechanism, 
however, it was impossible to know whether 
the responsible tissue-protective factor was 
properly induced by these interventions and 
if its effects would be mitigated by other factors. 
Our findings support further exploration 
of KynA, and more broadly GPR35 agonists, 
for the prevention and treatment of ischemic 
injuries. A number of synthetic 
GPR35 agonists have now been described, 
including compounds with far greater potency 
than KynA (20, 21). In addition, our findings 
provide a unifying explanation for the tissue-protective 
effects of kynurenine hydroxylase 
and kynurenine monooxygenase inhibitors. 
Although these inhibitors were originally 
developed to prevent the production of neuroexcitatory 
and neurotoxic tryptophan metabolites, 
they also promote the accumulation of 
KynA (68, 69). 


Fig. 5 (legend, continued). [...] with 20 µM KynA, 200 ng/ml PTX, or both before staining with 100 nM 
Mitotracker FM. Scale bar, 20 µm. (E) Blue-native PAGE analysis of mitochondria 
isolated from mouse neonatal cardiomyocytes treated with 20 µM quinolinic acid, 
20 µM KynA, or DMSO for 20 min. (F) Anti-FLAG immunoblot analysis after two-dimensional 
gel electrophoresis of cell extracts made from mouse neonatal 
cardiomyocytes stably expressing FLAG-ATPIF1 that were treated with 20 µM 
KynA, 1 µM pamoic acid, or DMSO for 20 min. (G and H) Fractional survival 
(G) and ATP levels (H) of wild-type and ATPIF1−/− hIPS-derived cardiomyocytes 
stably expressing wild-type or E55A ATPIF1 or EV that were pretreated with 
20 µM KynA, 20 µM quinolinic acid, or DMSO for 1 hour before simulated 
I/R ex vivo. Data are shown as mean ± SD. *P < 0.05. 


Meiotic exit in Arabidopsis is driven by 
P-body-mediated inhibition of translation 


Albert Cairo, Anna Vargova, Neha Shukla, Claudio Capitao, Pavlina Mikulkova, Sona Valuchova, 
Jana Pecinkova, Petra Bulankova, Karel Riha 


Meiosis, at the transition between diploid and haploid life cycle phases, is accompanied by reprograming of cell 
division machinery and followed by a transition back to mitosis. We show that, in Arabidopsis, this transition 
is driven by inhibition of translation, achieved by a mechanism that involves processing bodies (P-bodies). 
During the second meiotic division, the meiosis-specific protein THREE-DIVISION MUTANT 1 (TDM1) is 
incorporated into P-bodies through interaction with SUPPRESSOR WITH MORPHOGENETIC EFFECTS ON 
GENITALIA 7 (SMG7). TDM1 attracts eIF4F, the main translation initiation complex, temporarily sequestering it 
in P-bodies and inhibiting translation. The failure of tdm1 mutants to terminate meiosis can be overcome by 
chemical inhibition of translation. We propose that TDM1-containing P-bodies down-regulate expression of 
meiotic transcripts to facilitate transition of cell fates to postmeiotic gametophyte differentiation. 


Differentiation of plant gametes is a multistep 
process that involves meiosis, a 
reductional cell division that produces 
haploid cells from diploid progenitors. 
Meiotic chromosome segregation re- 
quires modification of the cell division pro- 
gram that is implemented through ordered 
expression of specialized proteins, reflected 
in transcriptional changes (1, 2). The duration 
of meiosis can range from half a day to 
decades, as in the case of human oogenesis, 
and the prolonged prophase I, during which 
chromosomes are condensed, limits instruc- 
tive transcription. Therefore, the activities of 
many meiotic genes are regulated by post- 
transcriptional and translational mechanisms 
(3, 4). Posttranscriptional gene regulation has 
been linked to biomolecular condensates. 
Studies in yeast indicate such condensates in 
meiosis, in which late meiotic genes are tran- 
slationally repressed in meiosis I by inclusion 
in amyloid-like aggregates. These aggregates 
dissolve during meiosis II, releasing transla- 
tional inhibition and allowing synthesis of 
proteins required for meiotic exit (5, 6). 
In animals, haploid products of meiosis 
directly differentiate into gametes. In plants, 
fungi, and some protists, however, haploid 
spores undergo mitotic divisions to form 
haploid free-living organisms or reproductive 
structures. In such haploid cell lineages, com- 
pletion of meiosis must be followed by repro- 
graming of cell division machinery and a 
transition back to mitosis. 

In angiosperm plants, differentiation of 
gametes includes the formation of pollen and 
the embryo sac, rudimentary multicellular 
gametophytes generated by postmeiotic (i.e., 
mitotic) divisions of haploid cells. In Arabidopsis, 
meiotic exit and the diploid-to-haploid tran- 
sition require the evolutionarily conserved 
nonsense-mediated mRNA decay (NMD) fac- 
tor SUPPRESSOR WITH MORPHOGENETIC 
EFFECTS ON GENITALIA 7 (SMG7), together 
with the plant-specific protein THREE-DIVISION 
MUTANT 1 (TDM1) (7-9). Mutants deficient in 
these proteins fail to terminate male meiosis 
and do not form microspores. Instead, haploid 
nuclei produced in meiosis undergo multiple 
cycles of spindle reassembly and chromatin 
recondensation without intervening DNA syn- 
thesis, resulting in aborted polyads (8, 10, 11). 
Meiotic exit is driven by TDM1. Its unsched- 
uled activation in meiosis I results in prema- 
ture termination of meiosis and formation of 
diploid microspores (9), but the molecular 
mechanism by which TDM1 terminates meio- 
sis is unknown. Here we demonstrate that, in 
Arabidopsis, meiotic exit and the diploid-to-haploid 
transition are mediated by inhibition 
of translation through temporally limited 
sequestration of translation initiation complexes 
by TDM1 in processing bodies (P-bodies). 


Fig. 1. SMG7-dependent localization of TDM1 in P-bodies. (A) Localization pattern of TDM1 
in PMCs from tdm1-4 plants complemented with pTDM1::TDM1:YFP. DNA was counterstained 
with 4',6-diamidino-2-phenylindole (DAPI). (B) Colocalization of DCP1 (35S::DCP1:YFP) and 
SMG7 (35S::YFP:SMG7) with TDM1 (35S::TagRFP:TDM1) in mesophyll protoplasts 6 hours after 
cotransfection. (C) TDM1-SMG7 interaction assessed by BiFC assay in protoplasts. The TDM1 
paralog At3G51280 was used as a specificity control. Representative images from at least 
three independent transfections are shown. In (B) and (C), autofluorescence from 
chloroplasts is indicated in gray. (D) Colocalization of SMG7 and TDM1 in telophase II 
PMCs expressing pSMG7::SMG7:TagRFP and pTDM1::TDM1:YFP. DNA was counterstained with DAPI. 
(E) Meiotic localization of SMG7 in PMCs from smg7-1 mutants complemented with 
pSMG7::SMG7:TagRFP. DNA was counterstained with DAPI. (F) Localization of TDM1 at the 
stages corresponding to telophase II in the indicated smg7 mutants expressing 
pTDM1::TDM1:YFP. Scale bars: 5 µm in (A) and (D); 10 µm in (B), (C), (E), and (F). 


TDM1 is recruited to P-bodies by SMG7 


TDM1 is a meiosis-specific protein expressed 
from late prophase I, as evidenced by re- 
porter constructs and protein immunodetec- 
tion (9, 12). To determine the localization of 
TDM1 during meiosis, we generated pTDM1::TDM1:YFP 
and pTDM1::TagRFP:TDM1 reporter 
lines (fig. S1). Whereas TDM1 shows mostly 
uniform cytoplasmic distribution in pollen 
mother cells (PMCs) during meiosis I, it forms 
prominent cytoplasmic foci in meiosis II that 
decline after tetrad formation (Fig. 1A, fig. S2, 
and movie S1). These foci are reminiscent of 
P-bodies, cytoplasmic condensates generated 
by liquid-liquid phase separation (LLPS) that 
predominantly consist of RNA decay proteins 
(13). We confirmed that these foci were 
P-bodies by colocalizing ectopically expressed 
TDM1:TagRFP in Arabidopsis mesophyll protoplasts 
with DECAPPING PROTEIN 1 (DCP1), 
an mRNA decapping factor, and SMG7 (Fig. 1B 
and figs. S3 and S4), both of which are core 
components of P-bodies (14). Formation of 
LLPS condensates depends on intracellular 
protein concentration. To test that the ob- 
served signal distribution was not caused by 
ectopic protein overexpression, we confirmed 
the colocalization of SMG7 and TDM1 at 
meiotic telophase II in PMCs stably express- 
ing the respective reporter constructs from 
native promoters (Fig. 1D and fig. S1). We 
found that P-bodies marked with SMG7: 
TagRFP are present throughout all stages of 
meiosis, not just in meiosis II (Fig. 1E). This 
indicates that TDM1 does not elicit P-body 
formation specifically in meiosis II but that 
it is recruited to preexisting P-bodies. 

tdm1 is genetically epistatic to smg7 with 
respect to the meiotic exit phenotype (8, 11), 
which, together with the colocalization in 
P-bodies, suggests a mechanistic link between 
these proteins. SMG7 is a phosphoserine binding 
protein that has so far been implicated in only 
NMD (15). In NMD, SMG7 binds phosphorylated 
UP-FRAMESHIFT 1 (UPF1) that marks 
mRNAs containing a premature termination 
codon, and mediates their recruitment to 
P-bodies (16, 17). However, the meiotic function 
of SMG7 is independent of NMD (18, 19). We 
therefore hypothesized that, analogous to its 
function in NMD, SMG7 binds and recruits 
TDM1 to P-bodies. 

We observed SMG7 interaction with TDM1 
in P-bodies by bimolecular fluorescence com- 
plementation (BiFC) in mesophyll protoplasts 
(Fig. 1C). SMG7 has a tripartite structure that 
consists of an N-terminal 14-3-3-like domain, a 
central helical domain, and an evolutionarily 
diverged C terminus (15) (fig. S4A). A combi- 
nation of two mutations in conserved residues 
of the 14-3-3 phosphoserine binding pocket, 
Lys77→Glu (K77E) and Arg185→Glu (R185E) 
(15, 17), abolished the SMG7-TDM1 interaction 
(Fig. 1C). P-bodies are archetypes of cytoplas- 
mic LLPS biocondensates, whose formation is 
driven by multivalent interactions that often 
include proteins with intrinsically disordered 
regions (20). We found that the C terminus 
of SMG7 is highly disordered and contains 
a prion-like domain enriched in polar amino 
acids that contribute to LLPS (21) (figs. S4 
and S5). Deletion of the C terminus or the 
prion-like domain decreased localization of 
SMG7 to P-bodies but did not abolish the 
SMG7-TDM1 interaction (Fig. 1C and fig. S4B). 
Together, these data suggest that SMG7 acts as 
an adaptor protein that binds TDM1 at its 14- 
3-3 domain and recruits it to P-bodies through 
the intrinsically disordered region in its C ter- 
minus. Indeed, TDM1 fails to form cytoplas- 
mic foci in PMCs of Arabidopsis smg7-1 null 
mutants, and its association with P-bodies is 
reduced in smg7-6 plants, which have a trun- 
cation in the SMG7 prion-like domain (Fig. 1F 
and fig. S4A). 


TDM1 mediates retention of eIF4F in P-bodies 


We used two approaches to identify addi- 
tional genes involved in meiotic exit. First, we 
performed a genetic suppressor screen for 
mutations that improve pollen formation and 
fertility of semifertile smg7-6 plants (11). Two 
suppressor lines, EMS43 and EMS98, harbored 
mutations in the At2G24050 gene, which encodes 
the eukaryotic translation initiation 
factor eIFiso4G2 (Fig. 2A, fig. S6, and data 
S1 and S2). The causality of these mutations 
was validated by allelic test and genetic complementation 
(fig. S7). A portion of PMCs 
in EMS43 and EMS98 lines formed tetrads 
(fig. S8), indicating that down-regulation of 
eIFiso4G2 facilitates meiotic exit in SMG7- 
deficient plants. 

Fig. 2. Genetic and biochemical interactions between eIF4Gs and SMG7/TDM1. (A) Alexander staining 
of anthers and (B) quantification of pollen count per anther in EMS98 and EMS43 lines (n = 8 anthers). 
Significance of the difference is indicated (two-tailed t test; ***P < 0.005; ****P < 0.0001). Scale bar in (A): 
100 µm. (C) TDM1-eIF4G interactions assessed by yeast two-hybrid assay. A truncated eIF4G protein 
composed of amino acids 819 to 1529 and the full-length eIFiso4G1 and eIFiso4G2 were used in the assay. 
Dilution series of yeast cells containing pGBKT7 (bait) and pGADT7 (prey) plasmids were spotted on 
synthetic media lacking either Leu and Trp (SD-LW) or Leu, Trp, and His and were supplemented with 1 mM 
3-amino-1,2,4-triazole (SD-LWH 1 mM AT). (D) Interaction between TDM1 and eIF4Gs assessed by BiFC in 
protoplasts. TDM1 paralog At3G51280 and eIFiso4G2 lacking the central HEAT1 domain were used as 
negative controls. Chloroplast autofluorescence is indicated in gray. Scale bar: 10 µm. 


Fig. 3. TDM1 recruits eIF4Gs to P-bodies. Colocalization of (A) eIF4Gs 
(35S::YFP-eIF4Gs) and TDM1 (35S::TDM1-TagRFP) and (B) eIF4A1 (35S::eIF4A1:CFP), 
eIFiso4G2 (35S::YFP:eIFiso4G2), and TDM1 (35S::TDM1:TagRFP) in mesophyll 
protoplasts. (C) Quantification of the YFP:eIF4Gs and eIF4A1:CFP signals in 
P-bodies from (A) and (B) relative to the total signal in a cell. Error bars indicate SD (n = 5 
to 7). Significance of the difference between sample and respective control 
without TDM1 is indicated (two-tailed t test; *P < 0.05; **P < 0.01; ***P < 0.005; 
****P < 0.0001). (D) Colocalization of TDM1 and eIFiso4G2 in telophase II 


In the second approach, we used a yeast two-hybrid 
assay to screen Arabidopsis cDNA libraries 
for TDM1-interacting proteins. The Arabidopsis 
translation initiation factors eIFiso4G2 and its 
homolog eIFiso4G1 were among the top hits 
(data S3). eIF4G proteins form a scaffold of the 
main eukaryotic translation initiation complex 
eIF4F, and Arabidopsis contains three eIF4G 
paralogs: canonical eIF4G and its shortened 
isoforms eIFiso4G1 and eIFiso4G2 (22), which 
we collectively refer to as eIF4Gs. The interaction 
between eIFiso4G2 and TDM1 was validated 
by in vitro coimmunoprecipitation (fig. S9). 

Pairwise yeast two-hybrid assays showed that 
TDM1 binds to all three paralogs (Fig. 2C), 
though the interactions appear to be weaker 
than between the eIF4Gs and another subunit 
of the eIF4F complex, the eIF4A1 helicase (fig. 
S9B). The TDM1-eIFiso4G2 interaction is mediated 
by the C-terminal portion of TDM1 and 
the central region of eIFiso4G2 encompassing 
the HEAT1 domain (fig. S9, B to D). 

BiFC assays showed that the interaction 
between TDM1 and eIF4G factors takes place 
in P-bodies (Fig. 2D). However, the eIF4F com- 
plex is not a constituent of P-bodies (14, 23), 
and Arabidopsis eIF4G paralogs are uniformly 
localized in the cytoplasm of mesophyll cells 
(Fig. 3A). Because TDM1 is normally expressed 
only in meiocytes and absent in mesophyll cells, 
we tested whether the association of eIF4Gs 
with P-bodies is caused by TDM1. Indeed, coexpression 
of TDM1 in mesophyll protoplasts 
led to relocalization of eIF4Gs into P-bodies, 
whereas they remained in cytoplasm when 
coexpressed with SMG7 (Fig. 3, A and C, and 
fig. S10). We next assessed whether localiza- 
tion of the eIF4A1 helicase is also affected by 
TDM1. Although we observed only a small por- 
tion of eIF4A1 in P-bodies when coexpressed 
with TDM1, this portion is increased as much 
as ninefold when TDM1 was coexpressed with 
individual eIF4Gs (Fig. 3, B and C, and fig. S11). 
These data suggest that TDM1 recruits eIF4A1, 
and likely the entire eIF4F complex, into 
P-bodies through interaction with its eIF4G 
scaffold subunits. 

The Arabidopsis eIF4Gs act redundantly, 
although they exhibit distinct functional specificities 
in the translation of different transcripts 
(22, 24). C-terminally tagged gene-YFP 
reporters expressed from native promoters revealed 
that all eIF4Gs are expressed in PMCs 
throughout meiosis, with eIFiso4G2 being the 
most abundant (fig. S12). Whereas eIF4Gs are 
mainly distributed throughout the cytoplasm, 
eIF4G and eIFiso4G2 exhibit a more dynamic 
pattern, associating with the spindle in anaphase I 
and granules during meiosis II (Fig. 
3D and fig. S12). The eIFiso4G2 granules colocalize 
with TDM1, indicating that they represent 
P-bodies, and a quantitative analysis of 
the signal revealed a peak of TDM1-eIFiso4G2 
association in telophase II (Fig. 3, D and E, and 
fig. S13). The granules are TDM1 dependent, as 
they do not form in tdm1-null mutants (Fig. 3F). 
These data validate our observations in mesophyll 
protoplasts and demonstrate the TDM1-dependent 
recruitment of eIFiso4G2 (and likely 
of eIF4G) to P-bodies during telophase II. 


Fig. 3 (legend, continued). PMCs expressing peIFiso4G2::eIFiso4G2:YFP and pTDM1::TagRFP:TDM1. 
Arrowheads indicate signal in cytoplasmic foci. The signal in granules represents 
1.2 ± 0.2% of the total signal (n = 8). (E) Signal intensity profiles of TDM1 (red) 
and eIFiso4G2 (yellow) centered at P-bodies during meiosis II. Shaded areas 
around lines indicate SDs (n = 8 to 9). (F) Representative pictures of eIFiso4G2 
localization in telophase II PMCs from wild-type and tdm1-4 mutants (five 
anthers were scored for each genotype). Arrowheads indicate eIFiso4G2 foci. 


Fig. 4. Inhibition of translation drives meiotic exit. (A) FRAP of TDM1:YFP, TDM1:TagRFP, and 
eIFiso4G2:TagRFP signals in P-bodies. A line harboring pTDM1::TDM1:YFP was used 
to measure FRAP in PMCs, whereas transient coexpression of either 35S::TDM1:TagRFP 
with 35S::YFP-eIFiso4G2 or 35S::eIFiso4G2:TagRFP with 35S::TDM1:YFP was performed to 
measure FRAP in protoplasts. Charts depict quantification of signal recovery in PMCs and 
protoplasts over 180 s after photobleaching. Error bars indicate SDs from three 
measurements. Scale bar: 1 µm. (B) Micrographs of TDM1:YFP aggregates formed in vitro 30 min 
after the addition of tobacco etch virus (TEV) protease to remove the His-MBP solubility tag. 
Scale bar: 5 µm. (C) TDM1 multimerization detected by in vitro protein 
immunoprecipitation (IP). IP was performed using α-GFP nanobody, and proteins were detected by 
Western blotting with an α-His antibody. (D) Time-lapse series of wild-type and tdm1-4 PMCs 
expressing pRPS5A::G3GFP:TUA5 and pHTA10::HTA10:TagRFP for microtubule (magenta) and 
chromatin (violet) labeling, respectively. Time in minutes is indicated, with diakinesis set 
as time point zero; meiosis I is depicted for only the wild type. (E) Time-lapse series of PMCs 
treated with CHX at time point 0. Time in minutes relative to the CHX treatment is indicated. 


In contrast to mesophyll protoplasts, where 
TDM1 was ectopically expressed from a strong 
promoter, only a fraction (1.2%) of the eIFiso4G2 
in PMCs is localized to cytoplasmic granules 
(Fig. 3D). This is consistent with the observa- 
tion that the expression level of TDM1 in PMCs 
is lower than for any of the eIF4Gs (fig. S12B). 


Meiotic exit is driven by inhibition 

of translation 

Biocondensates formed by LLPS, such as 
P-bodies, typically display a dynamic exchange 
of their protein constituents with the surrounding 
environment (20). Notably, fluorescence 
recovery after photobleaching (FRAP) 
showed no recovery of TDM1:TagRFP signal 
in P-bodies in either PMCs or mesophyll pro- 
toplasts (Fig. 4A), and this behavior was also 
recapitulated in TDM1:YFP condensates formed 
in vitro (fig. S14). TDM1 has a tendency to 
aggregate (Fig. 4B), and it multimerizes when 
kept soluble by fusion with the maltose bind- 
ing protein (Fig. 4C). These data indicate that 
TDM1 likely forms multimeric structures that 
are rigidly embedded within P-bodies. By con- 
trast, eIFiso4G2 exhibits a more dynamic as- 
sociation with TDM1-containing P-bodies, as 
its FRAP signal rapidly recovers (Fig. 4A). 
Thus, despite its specific interaction with 
TDM1, eIFiso4G2 can rapidly shuttle between 
P-bodies and the cytoplasm. 

These observations suggest that, during 
meiosis II, TDM1 bestows P-bodies with 
the capability to temporarily recruit eIF4F 
complexes—and, by this mechanism, to ter- 
minate meiosis and enable the transition to 
gametophytic development. Because P-bodies 
are associated with translational repression and 
because eifiso4g2 mutations reverse the inability 
of smg7-6 plants to exit meiosis (Fig. 2, A 
and B), we hypothesized that termination of 
meiosis is implemented through inhibition of 
translation. In support of this, TDM1 impaired 
in vitro translation in wheat germ extract (fig. 
S15A) and suppressed expression of the TCS::GFP 
cytokinin-inducible reporter in protoplasts 
(fig. S15, B and C). To test whether TDM1 
facilitates meiotic exit through translational 
repression, we assessed the impact of cyclo- 
heximide (CHX), an inhibitor of eukaryotic 
translation, on meiotic progression in tdm1 
mutants. Live-cell imaging of meiosis, using a 
GFP:TUA5 reporter to visualize microtubules 
and HTA10:TagRFP to visualize chromatin (25), 
showed that cytokinesis and tetrad formation 
normally occur ~4 hours after telophase II in 
controls as well as in PMCs treated with CHX 
at telophase II (movies S2 and S4 and Fig. 4, D 
and E). TDM1-null plants failed to undergo 
cytokinesis and instead entered multiple cycles 
of chromatin condensation-decondensation 
and spindle assembly-disassembly (movie S3 
and Fig. 4D). By contrast, treatment of tdm1 
PMCs with CHX at telophase II prevented 
these postmeiotic cycles of aberrant chromo- 
some segregation and induced orderly cyto- 
kinesis and tetrad formation in all scored 
meiocytes (movie S5 and Fig. 4E; 87 PMCs 
scored). TDM1 is inhibited during meiosis I by 
phosphorylation with CDKA;1-TAM. Its un- 
scheduled activation, either by a phospho-site 
mutation or inactivation of the cyclin TAM, 
leads to cytokinesis after telophase I and for- 
mation of dyads and diploid microspores (9). 
Application of CHX in wild-type PMCs during 
metaphase or anaphase I led to premature 
cytokinesis and dyad formation, recapitulat- 
ing the ectopic activation of TDM1 in all 81 
scored meiocytes (movie S6 and Fig. 4E). 
These observations show that TDM1 mediates 
meiotic exit by inhibiting translation. 


Discussion 


In this study, we dissected the molecular func- 
tion of SMG7 and TDM1 in meiotic exit in 
Arabidopsis. We demonstrate that meiotic exit 
is driven by inhibition of translation through 
TDM1-mediated recruitment of translation in- 
itiation factors to P-bodies. Despite their tight 
association, TDM1 does not have an intrinsic 
affinity for P-bodies and requires SMG7 for 
its incorporation. TDM1 is therefore another 
client of SMG7, in addition to the known interacting 
partner UPF1, implying a broader 
role for SMG7 in P-body metabolism and remodeling 
beyond its role in NMD. TDM1 appears 
to act as a scaffold protein that, through 
its specific interaction with eIF4Gs, transiently 
recruits the main eukaryotic translation initi- 
ation factor eIF4F to P-bodies. Meiotic exit in 
angiosperm plants marks the cell fate transi- 
tion, which coincides with the remodeling of 
cell division machinery as well as the alternation 
between genetically distinct sporophyte and 
gametophyte life forms. This is expected to be 
accompanied by the clearance of transcripts 
specifying the preceding state. Thus, it is likely 
that recruitment of eIF4F to P-bodies leads not 
only to translational repression of associated 
mRNAs but perhaps also to their degradation. 

Although P-bodies were identified more than 
two decades ago, their biological function is 
still subject to debate. P-bodies associate with 
RNA decay proteins, but the prevalent view 
is that they mainly act as sites of transla- 
tional repression and RNA storage (26, 27). 
Nevertheless, some studies suggest that they 
function in selective RNA decay (28, 29). We 
propose that the interaction between TDM1, 
embedded within P-bodies, and eIF4F leads to 
disassembly of the translation initiation com- 
plex and exposure of the mRNA 5’ end to de- 
capping proteins. The TDM1 binding site of 
eIFiso4G2 spans the HEAT1 domain (fig. S8), 
which is required for the interaction between 
poly-A binding protein PABP and eIF4G. This 
interaction tethers the 3’ and 5’ ends of mRNA 
and facilitates reinitiation of translation (30). 
Disruption of the eIFiso4G2-PABP interaction 
by TDM1 is expected to impair reinitiation of 
translation and weaken eIF4F binding to the 
5' mRNA cap (31). Together with the high 
local concentration of decapping factors in 
P-bodies, this may create a permissive envi- 
ronment for replacement of eIF4F with the 
decapping complex, priming mRNAs for deg- 
radation. This possibility is consistent with 
the observation that whereas TDM1 is tightly 
bound to P-bodies, eIFiso4G shows a more 
dynamic association. In this model, P-bodies 
with embedded TDM1 may be perceived as 
catalysts that mediate the disassembly and 
turnover of RNA-bound eIF4F (fig. S16). It is 
currently unknown how many transcripts are 
affected by this mechanism. On one hand, 
comparative proteomics analysis reported 
down-regulation of ~25% of detectable pro- 
teins between tetrads and microspores, sug- 
gesting a pronounced alteration of proteome at 
the diploid-to-haploid transition (32). On the 
other hand, single-cell transcriptomics in maize 
showed persistence of sporophytic mRNAs 
until pollen mitosis I (32, 33), suggesting that 
P-body-mediated inhibition of translation 
does not have a global effect on transcrip- 
tomes but may exhibit specificity toward a 
subset of transcripts. This is supported by the 
observation that only 1.2% of cellular eIFiso4G2 
is retained in P-bodies at a given time point. 
Nevertheless, considering the dynamic nature 
of this association with 50% signal recovery 
within ~90 s (Fig. 4A), a considerable portion 
of eIFiso4G2-associated mRNA can be shut- 
tled through P-bodies during telophase II, 
leading to remodeling of the translatome and 
hence facilitating a cell fate transition. 
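
The closing argument, that a small steady-state fraction can still process a large share of the pool when exchange is fast, can be made concrete with a back-of-the-envelope sketch. It assumes a single-exponential FRAP recovery, a two-state (cytoplasm versus P-body) exchange at steady state, and that the recovery rate approximates the P-body exit rate. The 3600 s time window and all function names are hypothetical; only the ~90 s half-time and the 1.2% retained fraction come from the text.

```python
import math


def rate_from_half_time(t_half_s):
    """Rate constant k (1/s) of a single-exponential recovery with this half-time."""
    return math.log(2) / t_half_s


def frap_recovery(t_s, k, mobile_fraction=1.0):
    """Normalized recovery F(t) = mobile_fraction * (1 - exp(-k * t))."""
    return mobile_fraction * (1.0 - math.exp(-k * t_s))


def fraction_visited(bound_fraction, k_off, duration_s):
    """Share of the free pool that entered a P-body at least once.

    Steady state of a two-state model: k_on * (1 - f) = k_off * f,
    so each free molecule enters at rate k_on = k_off * f / (1 - f).
    """
    k_on = k_off * bound_fraction / (1.0 - bound_fraction)
    return 1.0 - math.exp(-k_on * duration_s)


k = rate_from_half_time(90.0)      # ~90 s half-time (Fig. 4A)
half = frap_recovery(90.0, k)      # equals 0.5 by construction
# Assumption: a 1-hour window; the text does not give the relevant duration.
visited = fraction_visited(0.012, k, duration_s=3600.0)
print(f"k = {k:.4f} 1/s, visited fraction = {visited:.2f}")
```

Under these assumptions the visited share grows with the length of the window, which is the qualitative point made in the text; the actual value depends on parameters that were not measured here.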
HYBRIDIZATION 


Selection against admixture and gene regulatory 
divergence in a long-term primate field study 
Tauras P. Vilgalys, Arielle S. Fogel, Jordan A. Anderson, Raphael S. Mututua, 
J. Kinyua Warutere, I. Long'ida Siodi, Sang Yoon Kim, Tawni N. Voyles, Jacqueline A. Robinson, 
Jeffrey D. Wall, Elizabeth A. Archie, Susan C. Alberts, Jenny Tung 


Genetic admixture is central to primate evolution. We combined 50 years of field observations of 
immigration and group demography with genomic data from ~9 generations of hybrid baboons 
to investigate the consequences of admixture in the wild. Despite no obvious fitness costs to hybrids, 
we found signatures of selection against admixture similar to those described for archaic hominins. 
These patterns were concentrated near genes where ancestry is strongly associated with gene 
expression. Our analyses also show that introgression is partially predictable across the genome. This 
study demonstrates the value of integrating genomic and field data for revealing how “genomic 
signatures of selection” (e.g., reduced introgression in low-recombination regions) manifest in nature; 
moreover, it underscores the importance of other primates as living models for human evolution. 


The ancestors of modern humans intermixed 
with Neanderthals and other close, 
now-extinct lineages, leaving a genetic 
legacy that continues to shape human 
trait variation today (1-3). Even as these 
findings reshape our conception of human 
origins, they also bring us more closely in line 
with our primate relatives, where hybridiza- 
tion is observed in many species (4, 5). Studies 
of other living primates therefore provide con- 
text for understanding admixture dynamics in 
our own lineage. Field studies in hybrid zones, 
for instance, offer the opportunity to inte- 
grate demographic (e.g., reproductive success, 
immigration/emigration), phenotypic, and ge- 
nomic data on early-generation hybrids, which 
studies in humans suggest experienced the 
greatest fitness costs (6, 7). 

Thus far, studies suggest that ancestry fre- 
quently predicts trait variation in primate hybrid 
zones, but admixture often does not result in 
overt fitness costs (8-11). However, field obser- 
vations have not been combined with popu- 
lation and functional genomic analyses to 
investigate both the organismal and molec- 
ular consequences of admixture in primates. 
Here, we took such an approach to investigate 
whether selection against introgression 
(i.e., alleles introduced by gene flow from one 
distinct lineage to another) is compatible with 
apparently healthy hybrids, investigated the 
functional consequences of introgressed alleles, 
and followed the course of hybridization and 
natural selection across generations. 

We focused on admixture between yellow 
baboons (Papio cynocephalus) and anubis 
baboons (P. anubis): large-bodied, terrestrial 
primates long used as a model for human 
biology and evolution (12). Although baboon 
taxonomy has undergone many revisions over 
time, six extant baboon species are currently 
recognized on the basis of distinct phenotypic 
differences and a pattern of phylogenetic di- 
vergence supported by recent whole-genome 
sequencing data (12-14). This phylogeny estab- 
lishes two major baboon lineages (the “northern” 
and “southern” clades) that separated ~1.4 mil- 
lion years ago, although the complex evolution- 
ary history of baboons means that they may 
have experienced episodes of gene flow since 
that time (14-16). Anubis and yellow baboons
belong to the northern and southern clades, 
respectively, and both phylogenetic and pop- 
ulation genomic analyses confirm their diver- 
gence into distinct taxa (13, 14). Nonetheless, 
they interbreed to produce viable and fertile 
offspring where their ranges meet (Fig. 1A). 

We concentrated on the region in and around 
the Amboseli basin of Kenya, where data from 
50 years of continuous observation of a popula- 
tion near the center of the hybrid zone are 
available (17). Members of this majority-yellow 
baboon population include descendants of 
historical admixture prior to the start of moni- 
toring in 1971, as well as descendants of a 
directly observed, recent wave of admixture 
beginning in 1982 (15, 18). In Amboseli, hybrids
do not experience obvious fitness costs, and
anubis ancestry may in fact confer benefits,
including accelerated maturation, increased
mating success, and higher rates of male-female
affiliation (19-21). However, field and
microsatellite data indicate that the hybrid zone 
is narrow (22), which suggests that natural se- 
lection may act to limit gene flow. 


Structure of the baboon hybrid zone 


To assess selection against introgression in 
hybrid anubis-yellow baboons, we used whole- 
genome resequencing data to evaluate ances- 
try patterns for animals sampled in and near 
the Amboseli hybrid zone (Fig. 1 and table S1). 
We generated resequencing data from 430 wild
baboons from Amboseli and Mikumi National
Park in Tanzania [17 high-coverage (mean =
22.51×); 413 low-coverage (mean = 1.04×)],
which we combined with published baboon
genomes from Amboseli (n = 22), Mikumi (n = 5),
the Maasai Mara National Reserve (n = 7),
and the Aberdares region of central Kenya
(n = 2) (14, 15, 23). In Amboseli, our sample included
442 baboons born between 1969 and 2016.
Finally, we also included the genomes of 39
captive baboons from the Southwest National
Primate Research Center (SNPRC; n = 31 colony
founders, 33 total) and the Washington
National Primate Research Center (WNPRC;
n = 6) (14, 15, 24).

We estimated global and local ancestry for 
each individual using a composite likelihood 
method suitable for low-coverage data, LCLAE, 
which uses genotype likelihoods across genomic 
windows rather than requiring genotypes 
at specific variants (13, 15). These results confirmed
that admixture is minimal or absent
in the anubis baboon founders of the SNPRC 
colony, anubis baboons from Maasai Mara, 
and yellow baboons from Mikumi (Fig. 1 and 
fig. S5), although we cannot exclude ancient 
bouts of admixture that affect all living baboons. 
In contrast, all baboons from Amboseli are 
admixed (Fig. 1A; mean = 30 to 37% genome-wide
anubis ancestry ± 10% SD), including
many whose ancestry can be traced to anubis-like
immigrants within the most recent seven
generations. These results closely match F4-ratio
estimates (25) (<2% difference for nine high-coverage
Amboseli genomes), indicating that
putative anubis ancestry in Amboseli reflects
admixture, not incomplete lineage sorting (13).

We also detected a signal of ~17% mean 
anubis ancestry in the putatively yellow 
baboon founders of the SNPRC colony, which 
were previously thought to be unadmixed 
(Fig. 1) (24). Identity-by-descent (IBD) analysis 
using IBDMix (26) confirmed this pattern 
(Fig. 1C). Because IBD between Mikumi yellow 
baboons and anubis baboons is <5%, these 
findings also implicate admixture rather than 
incomplete lineage sorting (Fig. 1C) (13). Com- 
bined with evidence for yellow ancestry in a 
central Kenyan anubis baboon (13, 14), our 
results indicate that gene flow has been a 
common feature of baboon evolution in east 
Africa.

Fig. 1. The structure of the baboon hybrid zone in Amboseli and the
surrounding region. (A) Geographic locations and local ancestry estimates
for baboons in this study (black asterisk = Amboseli). For each population,
each row corresponds to the first 20 Mb of chromosome 1 for one individual,
organized vertically by global ancestry. For Amboseli, a subsample of
100 individuals is shown. Central map: Ranges of yellow baboons and
anubis baboons in Kenya and Tanzania. Small map: Ranges of all
six African baboons (47), modified from a map by Kenneth Chiou (CC BY
3.0 license). [Baboon drawings by Christopher Smith] (B) Principal
components analysis (PCA) of genotype data for high-coverage genomes.
Inset: Distribution of “yellow-like” individuals along PC1. SNPRC
yellow baboon founders resemble admixed Amboseli baboons. (C) IBDMix
(26) results for three sets of yellow or majority-yellow baboons. SNPRC
yellow baboon founders and Amboseli baboons exhibit substantial
identity by descent (IBD) with anubis baboons, whereas IBD estimates
for Mikumi yellow and anubis baboons are low. The excess IBD in the
SNPRC and Amboseli samples points to the contribution of gene
flow beyond residual incomplete lineage sorting.

Selection against introgression in Amboseli
To investigate whether selection restricts gene 
flow between anubis and yellow baboons, we 
focused on the multigenerational dataset from 
Amboseli. We replicated three analyses used 
to infer selection against Neanderthal or 
Denisovan introgression in humans (27-29). 
First, we tested for a relationship between 
anubis ancestry in Amboseli and yellow-anubis 
genetic divergence [based on unadmixed pop- 
ulations (13)]. Because the Amboseli population 
is largely of yellow baboon origin, if hybrid- 
ization is deleterious, selection is expected to 
be less permissive of anubis alleles that are 
more diverged from their yellow counterparts. 
Indeed, we found that anubis ancestry is sys- 
tematically lower in regions of the genome 
with more fixed differences (Fig. 2A; Spearman’s 
rho = -0.119, P = 8.05 x 10°“). In Amboseli, 
anubis alleles were 6.7% less common in the 
most diverged percentile relative to the least 
diverged percentile of the baboon genome. 
These results are similar to the negative 
correlation between the density of fixed 
human-Neanderthal differences and introgressed 
Neanderthal ancestry in modern humans (27) 
(Fig. 2B and table S2). 
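
As a minimal sketch of the statistic behind this comparison, the Python below computes Spearman's rho between per-window divergence and introgressed ancestry. The window values, variable names, and helper functions are hypothetical illustrations; they are not taken from the study's data or code, and no P value is computed here.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the (tie-averaged) ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: one entry per genomic window.
fixed_differences = [2, 5, 5, 9, 14, 20]               # yellow-anubis fixed differences
anubis_ancestry = [0.41, 0.38, 0.36, 0.33, 0.35, 0.27]  # mean anubis ancestry

rho = spearman_rho(fixed_differences, anubis_ancestry)
print(f"Spearman's rho = {rho:.3f}")  # negative: less ancestry where divergence is higher
```

A rank-based statistic is a natural fit for this kind of comparison because it only assumes a monotonic relationship between divergence and ancestry, not a linear one.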

Second, we tested whether introgressed 
anubis ancestry is depleted in genomic regions 
that are likely to be affected by linked selection, 
as summarized by B statistic values calculated 
for the baboon genome (13, 30) (i.e., due to high 
gene density per recombination distance). Again 
paralleling the case of Neanderthal ancestry in 
modern humans (28), anubis ancestry was most 
common in regions that were predicted to be 
least affected by linked selection (Fig. 2, C and 
D; Spearman’s rho = 0.168, P = 1.73 x 10°°). 
Consequently, anubis ancestry per individual 
was reduced, on average, by 7.03% in protein- 
coding regions relative to random, size-matched 
regions of the genome (±4.20% SD; n = 442
Amboseli baboons). Reductions in promoters
and putative peripheral blood mononuclear
cell enhancers were 5.56% (±4.10% SD) and
6.22% (±4.20%), respectively.

Third, we tested whether introgressed anubis 
ancestry is positively correlated with local 
recombination rate. This relationship is pre- 
dicted if recombination influences the rate at 
which natural selection eliminates deleterious 
introgressed ancestry and uncouples deleteri- 
ous from neutral introgressed variants. This 
prediction, documented across diverse taxa 
(29, 31, 32), was also observed in baboons (Fig. 
2E; Spearman’s rho = 0.127, P = 148 x 10°*°), 
with a magnitude similar to that reported for 
Neanderthal and Denisovan gene flow into 
modern humans [Fig. 2F; Spearman’s rho = 
0.17 and 0.14 for Neanderthal and Denisovan 
ancestry, respectively (29)]. 

To investigate these patterns further, we 
took advantage of the dynamic history of 
admixture within the Amboseli population.
At the beginning of monitoring in 1971, all
Amboseli animals were considered to be yellow 
baboons (33). Phenotypically anubis and ad- 
mixed animals immigrated into the population 
starting in 1982, and the proportion of hybrid 
animals increased over the following decades 
(18, 34). Whole-genome data recapitulate these 
patterns, documenting an increase of 11.8% 
anubis ancestry from 1971 (23.1 to 29.6%) 
to 2020 (34.9 to 41.4%) (Fig. 3A). However, 
animals with no known anubis ancestors 
during the 50-year field study were also 
clearly admixed (Fig. 3B). Additionally, although 
immigrant males were more anubis-like than 
the study population as a whole (Fig. 3A), one 
immigrant male was among the most yellow- 
like in our sample (78.8% yellow ancestry), 
indicating ongoing gene flow involving both 
parental taxa. 

The Amboseli population today therefore 
contains individuals that descend from ancient, 
unobserved admixture events as well as those 
affected by recent hybridization, generating a 
bimodal distribution of genome-wide ancestry 
(Fig. 3C) (15). By integrating local ancestry calls, 
pedigree information, and field observations, 
we identified 188 “recently” admixed individuals 
whose ancestors include at least one anubis-like 
immigrant within the last seven generations 
(mean = 1.7 generations, although these animals
are not classical F1 or F2 hybrids because historical
gene flow is involved). We also classified
214 baboons as “historically” admixed, as their
genomes only contain anubis ancestry from
before 1971. Forty baboons could not be assigned
to either hybrid class (13). Based on a
single-pulse model of admixture using DATES
(35), historical admixture is dated to a mean of
283 (±242 SD) generations ago (n = 7 high-coverage
genomes), in contrast to 5 and 21 generations
ago for two recent hybrids sequenced to
high coverage.

Stratifying individuals in the dataset by 
admixture history reveals that signatures of 
selection against introgression are driven by 
historical admixture (i.e., genomes sampled 
dozens to hundreds of generations post- 
contact). Historically admixed individuals are 
more depleted of anubis ancestry in highly 
diverged and low B-value regions of the genome 
than recently admixed animals (Fig. 3, D and E). 
Further, the relationship between anubis ances- 
try and recombination rate is exclusive to the 
historically admixed dataset, even when recom- 
bination rates are measured on chromosome- 
level scales (Fig. 3F and table S3) (13). The 
weaker signature of selection in recent hybrids 
likely reflects intermittent gene flow in the 
most recent generations and stochastic inher- 
itance processes. In contrast, sufficient gener- 
ations have passed since historical admixture 
to break apart large introgressed haplotypes, 
allowing us to observe nonrandom patterns of 
ancestry across the genome. This result emphasizes
the importance of complementing field
observations with genomic data, which provide 
insight into selective processes that operate over 
time scales longer than even the longest-running 
field studies. 


Selection against regulatory divergence 


Analyses of human-Neanderthal admixture 
suggest a consistent pattern of selection against 
regulatory variants (36). If so, the introgressed 
regions that persist in modern humans have 
likely been purged of many alleles with large 
regulatory effects (37, 38). However, direct com- 
parisons between the effect sizes of retained 
versus lost archaic alleles are difficult, as 
only a fraction of archaic hominin alleles (e.g., 
Neanderthal, Denisovan, or other ghost line- 
ages) segregate in modern human genomes 
today (28, 39). Extant primate populations, 
where hybridization and selection are ongoing, 
provide an opportunity to test this hypothesis. 

To test for selection against gene regulatory 
divergence in baboons, we paired genetic an- 
cestry data with blood-derived RNA-sequencing 
data from 145 individuals (n = 157 samples)
(40-42) (table S1). This dataset includes whole 
blood and white blood cells, which were analyzed 
separately while controlling for age, sex, and 
kinship (13). Among 10,192 analyzed genes, we
identified no significant associations between 
genome-wide ancestry and gene expression 
levels (10% false discovery rate). In contrast, 
local ancestry predicted gene expression 
levels for 20.1% (2046) of tested genes in one 
or both datasets (Fig. 4A), with concordant 
additive effects between datasets (Pearson’s 
R = 0.48, P < 10°?) and little evidence for 
non-additivity (13).

If introgressed alleles that perturb gene 
regulation are a primary target of selection, 
we reasoned that selection should purge anubis 
ancestry near genes where ancestry strongly 
affects gene expression. In support of this 
prediction, the top 15% of genes with the lar- 
gest local ancestry effects on gene expression 
harbored 1.5% less anubis ancestry, on aver- 
age, than the bottom 15% of genes with the 
smallest local ancestry effects (Fig. 4B; paired 
t test, P = 1.10 x 10°, n = 442). This difference 
was exaggerated within historically admixed 
individuals (1.9% reduction, P = 1.26 x 10°, n = 
214; table S4). Further, the correlation between 
anubis ancestry and local recombination rate is 
larger for genes with the largest local ancestry 
effects than for those with the smallest (Fig. 4C; 
rho difference = 0.07 for the top versus bottom 15% of
genes, bootstrapped P = 0.027; table S4). Combined
with the depletion of introgressed sequence
in regulatory elements, these results
support the hypothesis that introgressed alleles 
that affect gene regulation are nonrandomly 
purged after hybridization. They are therefore
consistent with the idea that natural
selection removed archaic variants with large
regulatory effects from the genomes of modern
humans (37).

Fig. 2. Selection against introgression in the Amboseli baboons mirrors
patterns described for archaic hominin admixture. (A, C, and
E) Proportion of introgressed (anubis) ancestry in Amboseli in 250-kb windows
(n = 10,324 total windows) as a function of (A) fixed differences between
yellow and anubis baboons (Spearman's rho = -0.119, P = 8.05 x 10°), (C)
mean B value (rho = 0.168, P = 1.73 x 10°), and (E) mean recombination
rate (rho = 0.127, P = 2.49 x 10°*°), divided into quintiles for visualization
purposes only. Dashed gray lines show median anubis ancestry across all
windows. (B, D, and F) Predicted relationships between introgressed ancestry
and all three measures are observed for both anubis ancestry in the Amboseli
baboons (solid lines) and Neanderthal ancestry in modern human genomes
(dashed lines) (27-29), consistent with selection against introgression. Panels
show the relationship between introgressed ancestry and the rank-ordered (B)
number of fixed differences, (D) mean B statistic, and (F) mean local
recombination rate. Mean introgressed ancestry is centered on 0 and divided by
the standard deviation for each species to facilitate comparison.

Fig. 3. Recent and historical hybrid ancestry in Amboseli. (A) Mean
genome-wide anubis ancestry in the Amboseli population has increased
since the 1970s. Numbers above the x axis indicate the number of individuals
used to calculate annual ancestry (black, all individuals; green, male
immigrants). (B) Pedigree and ancestry estimates for example historical (left)
and recent (right) hybrids. Pedigree individuals with resequencing data
are colored by ancestry. The two examples share a maternal grandmother and
were born a few years apart [yellow and bright green asterisks in (A)]. The
father of the recent hybrid immigrated in 2004 [olive green asterisk in (A)].
(C) Genome-wide anubis ancestry in Amboseli, with density plots overlaid for
historical and recently admixed individuals. (D to F) The relationships
between introgressed anubis ancestry and the rank-ordered (D) number of
fixed anubis-yellow differences, (E) mean B value, and (F) mean local
recombination rate. All relationships are stronger for historical hybrids
than for recent hybrids. Right panels show anubis ancestry within each
dataset mean-centered to 0.

Fig. 4. Selection against gene regulatory divergence and prediction of
local introgression levels. (A) Local ancestry predicts gene expression in
the Amboseli population, as depicted for an example gene (MRPL2). Inset:
quantile-quantile plot comparing P distributions for observed local ancestry
effects (y axis) to a permutation-based null (x axis). (B) Difference in
introgressed anubis ancestry between genes with the smallest versus
largest local ancestry effects on gene expression. Violin plots show the
distribution of differences across individuals; boxplots show the median
and interquartile range (all P < 0.05; table S4). (C) Correlation between
anubis introgression and recombination rate calculated for sets of genes
with the largest (blue) versus smallest (magenta) local ancestry absolute
effect sizes. Asterisks denote bootstrapped P < 0.05 (table S4); error
bars show SD. (D) Distribution of effect sizes for features that consistently
predict the extent of anubis introgression in Amboseli baboon genomes
(table S5). The number of single-nucleotide variants (SNVs) is derived
from unadmixed yellow and anubis baboon genomes.


Predicting the genomic landscape of introgression


Finally, we investigated our ability to predict
the genomic locations most and least affected
by introgression. We modeled mean anubis
ancestry as a function of local recombination
rate, single-nucleotide polymorphism density
in the reference yellow and anubis populations,
yellow-anubis genetic divergence, gene and
enhancer content, linked selection, and local
ancestry-associated gene expression in blood.
We iteratively trained an elastic net regression
model on nonoverlapping 250-kb windows of
the genome, representing 75% of the genome,
and applied the model to a test set of windows
in the remaining 25% (13). We found that our
predictions were correlated with observed levels
of anubis ancestry in the test sets (mean Pearson’s R =
0.254 ± 0.016 SD versus 0.014 ± 0.011 SD for
models fit to permuted data), with frequent
contributions from features capturing local
recombination rate, linked selection, genetic
variation, and sequence divergence (Fig. 4D
and table S5). We consistently predicted anubis
ancestry more accurately in historical hybrids
than in recent hybrids (mean Pearson’s R =
0.265 ± 0.017 SD versus 0.177 ± 0.018 SD,
bootstrapped P < 107°).

Our longitudinal data also indicate that increases
in anubis ancestry across the 50-year
field study are nonrandomly distributed
throughout the genome. Controlling for the
starting level of anubis ancestry in 1979, 100-kb
windows characterized by lower FST and higher
recombination rates experienced larger increases
in anubis ancestry between 1979 and 2020,
although both effect sizes were small (FST and
recombination rate P values < 10°’, β = -2.965 x
10* and 1.020 x 10~+, respectively; n = 25,797
windows; table S6). B statistic values did not
predict temporal change in anubis ancestry
independently of recombination rate.
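
The held-out evaluation of the elastic net model described in this section can be sketched as follows. This is an illustrative toy rather than the study's pipeline: the features are simulated, the penalty settings (`lam`, `alpha`), the single 75/25 split, and all function names are assumptions, and the fitter is a generic coordinate-descent implementation written only with the Python standard library.

```python
import random
from math import sqrt


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso part of the update)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, lam=0.01, alpha=0.5, n_iter=30):
    """Coordinate descent for the objective
    (1/2n)*||y - b0 - Xb||^2 + lam*(alpha*||b||_1 + (1 - alpha)/2*||b||_2^2)."""
    n, p = len(X), len(X[0])
    intercept = sum(y) / n
    beta = [0.0] * p
    residual = [yi - intercept for yi in y]
    columns = [[row[j] for row in X] for j in range(p)]
    scales = [sum(v * v for v in col) / n for col in columns]
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Correlation of feature j with the residual that excludes its own fit.
            rho = sum(v * (r + v * beta[j]) for v, r in zip(col, residual)) / n
            new_bj = soft_threshold(rho, lam * alpha) / (scales[j] + lam * (1 - alpha))
            delta = new_bj - beta[j]
            residual = [r - v * delta for v, r in zip(col, residual)]
            beta[j] = new_bj
        shift = sum(residual) / n  # refit the unpenalized intercept
        intercept += shift
        residual = [r - shift for r in residual]
    return intercept, beta


def predict(model, X):
    intercept, beta = model
    return [intercept + sum(b * v for b, v in zip(beta, row)) for row in X]


def pearson(x, y):
    """Pearson's R; returns 0.0 when either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


# Simulated windows: columns 0 and 1 stand in for informative features
# (e.g., recombination rate and divergence); columns 2-4 are pure noise.
rng = random.Random(0)
n_windows, n_features = 400, 5
X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_windows)]
y = [0.35 + 0.05 * row[0] - 0.04 * row[1] + rng.gauss(0, 0.03) for row in X]

# Train on 75% of windows, test on the remaining 25%.
idx = list(range(n_windows))
rng.shuffle(idx)
cut = int(0.75 * n_windows)
train, test = idx[:cut], idx[cut:]
X_train, y_train = [X[i] for i in train], [y[i] for i in train]
X_test, y_test = [X[i] for i in test], [y[i] for i in test]

r_real = pearson(predict(fit_elastic_net(X_train, y_train), X_test), y_test)

# Null comparison: refit after permuting the training response.
perm_rs = []
for _ in range(20):
    y_perm = y_train[:]
    rng.shuffle(y_perm)
    perm_rs.append(pearson(predict(fit_elastic_net(X_train, y_perm), X_test), y_test))
mean_r_perm = sum(perm_rs) / len(perm_rs)

print(f"held-out R = {r_real:.3f}; permuted-response mean R = {mean_r_perm:.3f}")
```

Comparing the held-out correlation against models fit to permuted responses, as the text does, guards against reading structure into predictions that arise by chance; repeating the split many times would give the spread that the reported SDs summarize.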


Divergence and hybridization in primates 


Our genomic analysis reveals evidence for 
selection against admixture that is remarkably 
consistent with results obtained for archaic 
introgression in humans. Our results also support
a hypothesis that can only be indirectly tested
in our own lineage: that natural selection has 
acted to eliminate introgressed alleles that 
strongly perturb gene regulation (37). These 
results contrast with the behavioral and life 
history evidence to date in Amboseli—one of 
the largest and longest-running primate field 
sites in the world—which indicates that hybrid 
baboons suffer no obvious fitness costs (19-21). 
Our results identify subtle selection against 
hybridization that may help to explain the 
maintenance of primate taxonomic integrity 
in the face of frequent interspecific gene flow 
(4, 5). Ultimately, the outcome of this process 
will depend on the relative balance among this 
selection pressure, possible advantages to intro- 
gressed ancestry, migration rates, and demo- 
graphic stochasticity—potentially explaining 
cases of nuclear swamping in baboons despite 
costs to hybridization (16). 

The mode of selection against hybrids is 
unclear. Unlike in humans, hybridization load 
is unlikely to explain our results: Yellow and 
anubis baboons harbor similar levels of genetic 
diversity compared to humans and Neanderthals, 
which differ more [<50% difference in baboons 
versus a >3-fold difference between humans 
and Neanderthals (6, 14, 15, 27)]. Both hybrid 
incompatibilities and ecological selection, how- 
ever, could play a role. For example, some 
reports suggest that anubis and yellow ba- 
boons occupy distinct climatic niches (43). 
Previously described assortative mating by an- 
cestry in the Amboseli baboons (20) may also 
limit introgression. Understanding the genetic 
and phenotypic mechanisms that influence 
interspecific gene flow, including the role of the 
X chromosome and adaptive introgression, re- 
mains an important goal for future work. 

Combined, our findings illustrate the impor- 
tance of contextualizing genomic data with 
phenotypic and demographic information to 
understand the evolutionary dynamics of 
admixture. Genomes harbor information about 
historical processes that stretch back many 
generations, and can capture subtle signatures 
of selection that may not be obvious in natural 
populations where demographic stochasticity 
is high, sample sizes are modest, and the 
specific phenotypes under selection may be 
unknown. Conversely, field data reveal the 
range of phenotypic and fitness outcomes 
that are compatible with genomic signatures 
of selection. Indeed, genomic evidence alone 
has led some researchers to posit that the 
costs of modern human-archaic hominin inter- 
breeding must have been high, reflecting species 
at the brink of reproductive incompatibility 
(44, 45). Our results point to the limits of these 
inferences by indicating that qualitatively 
similar evidence for selection against intro- 
gression can be compatible with primate hy- 
brids that thrive (19-21). This work therefore 
highlights the crucial role of other primates
for understanding human evolution, especially
for phenomena that are impossible to
study in our lineage alone.


Multimodal perception links cellular state to
decision-making in single cells


Bernhard A. Kramer*, Jacobo Sarabia del Castillo’, Lucas Pelkmans’* 


Individual cells make decisions that are adapted to their internal state and surroundings, 
but how cells can reliably do this remains unclear. To study the information processing 
capacity of human cells, we conducted multiplexed quantification of signaling responses and 
markers of the cellular state. Signaling nodes in a network displayed adaptive information 
processing, which led to heterogeneous growth factor responses and enabled nodes to 
capture partially nonredundant information about the cellular state. Collectively, as a 
multimodal percept this gives individual cells a large information processing capacity to 
accurately place growth factor concentration within the context of their cellular state 

and make cellular state-dependent decisions. Heterogeneity and complexity in signaling 
networks may have coevolved to enable specific and context-aware cellular decision-making 


in a multicellular setting. 


Contextual decision-making by cells in a
collective is a hallmark of multicellular
systems (1-4). To achieve context-aware
behavior, individual cells must integrate
the input they receive from growth factors
with complex information on their physico- 
chemical state. Cells perceive this information 
through activation of intracellular signaling 
networks; however, individual signaling nodes 
in such networks are thought to have low 
capacity for processing information as a result 
of their highly variable growth factor responses 
in single cells (5-7). It thus remains largely
unknown if and how individual cells can process 
a large amount of information in a contextual 
manner. 

We explored the possibility that variable 
growth factor responses do not reflect a 
limited information processing capacity but 
instead represent adaptive information pro- 
cessing. In adaptive information processing 
the response of a signaling node in an indi- 
vidual cell is adapted to the physicochemical 
state of the cell and its surroundings (here 
collectively referred to as the cellular state), 
through mechanisms by which the cellular 
state controls the activities of signaling nodes 
(8, 9). This implies that signaling responses not 
only capture information about the amount of 
growth factor a cell is exposed to but also—and 
perhaps primarily—obtain information about 
its cellular state. If the activation of different 
signaling nodes in a network is dependent on 
different properties of the cellular state, then
each node would carry partially nonredundant
information. As a whole, the network
could then generate a multimodal percept
that captures a comprehensive picture of
a cell’s multicellular context and internal 
state, facilitating accurate and contextual 
decision-making. 


Results 
Multiplexed quantitative imaging of signaling 
and cellular state 


To obtain multiple readouts of the cellular 
state in addition to signaling responses we 
applied 4i—a high-resolution multiplexing 
technology on the basis of iterative staining 
and elution of antibodies (10)—to human epi- 
thelial cells (184A1). After 4 days of growth, 
cells were deprived of serum and growth factors 
for 12 hours and subsequently exposed to five 
different concentrations of epidermal growth 
factor (EGF) for 5 min (Fig. 1A). After 30-plex 
4i, cell segmentation, and quality control, the 
dataset contained ~8000 individual cells per
replicate (n = 3) and condition (Fig. 1B and fig.
S1, A to C), showing high technical and biological
reproducibility (fig. S1, D to F). Images
of cells reveal the highly heterogeneous nature 
of acute signaling responses as well as that of 
the cellular state (Fig. 1C). Quantifying the 
abundances of three signaling responses and 
three cellular state markers in every cell and 
comparing single-cell distributions of the five 
conditions revealed that although signaling 
responses typically display changing levels 
with increasing amounts of EGF in either a 
gradual or switch-like bimodal manner, their 
responses were highly heterogeneous between 
cells. This results in overlapping single-cell 
distributions between different doses of EGF 
(Fig. 1D and fig. S2, A and B). By contrast, 
cellular state markers did not show any change 
(Fig. 1D). A systematic assessment of all single- 
cell features quantified from the images re- 
vealed that ten signaling responses downstream 
of EGF displayed significant changes whereas 
650 features of the cellular state did not change 


during a 5 min exposure to EGF (Fig. 1E). These 
features quantify or act as proxies of properties 
across multiple spatial scales such as relative cell 
positioning within a population, position in the 
cell cycle, and subcellular textures of organelles 
(Fig. 1F and fig. S2, C and D). Thus, although
the cellular state changes in response to EGF 
stimulation (12), this occurs at longer time- 
scales. Inhibitors of signaling nodes abrogated 
signaling responses to EGF but did not change 
markers of the cellular state (fig. S2E). We can 
thus test whether the heterogeneity seen in 
acute EGF-induced signaling responses is 
linked to the preexisting heterogeneity in 
cellular states. 


The preexisting cellular state landscape shapes 
signaling responses in single cells 


A projection of the multidimensional cellular 
state space into a two-dimensional (2D) land- 
scape (12) was largely continuous except for 
two parts that reflect G1 and G2 of the cell 
cycle (Fig. 1F and fig. S2, F and G). Signaling 
responses of single cells distribute in differ- 
ent patterns across this landscape [Fig. 2A 
(middle) and fig. S4A], which was accurately 
predicted by features of the cellular state (Fig. 
2A and fig. S3). This required multiple features 
in different combinations for different signal- 
ing nodes (fig. S4, B to D). For instance, al- 
though local cell density was an important 
feature for most signaling responses, the abun- 
dance of Paxillin—a proxy for cell spreading— 
was particularly important for predicting levels 
of pS6 in single cells, the abundance of Sec13 
was particularly important for predicting pERK, 
and the amount of nuclear Yap1 was important 
for predicting nuclear translocation of FoxO3a 
[Fig. 2A (middle) and fig. S4B]. Thus the cellular 
state has a specific multivariate effect on sig- 
naling responses in individual cells. It also 
predicts switch-like bimodal response properties 
(being either a low or high responder) with high 
accuracy, as shown by some signaling nodes at 
low doses of EGF (Fig. 2B, Fig. 1D, fig. S2B, 
and fig. S4E), and achieves high prediction 
accuracy across the full range of EGF doses 
tested (Fig. 2C, fig. S4F, and fig. S5). As a result,
variation in signaling responses induced by 
different cellular states is typically larger than 
variation induced by different concentrations 
of EGF (Fig. 2, D and E). 


Multimodal perception accurately decodes 
EGF concentration 


The above suggests that the sensitivity of signaling
nodes, quantified by the effective concentration of EGF
at which a node is half-maximally activated (EC50), is
adapted to the cellular state. To study this, we defined
18 cellular state classes and compared their dose-response
curves (Fig. 3A and fig. S6, A to E). This showed that,
for instance, the EC50 of pERK differed by a factor of at
least three between
Fig. 1. Acute signaling responses and preexisting cellular states across
spatial scales. (A) Experimental workflow. (B) 4i, imaging setup and quality
control pipeline. (C) Two combinations of three channels of one imaging site
(top) and zoom-in of cells (bottom). Scale bars: 7.5 µm. (D) Density
distributions of mean intensities in either the cytoplasm or nucleus of six 
markers. (E) Systematic identification of features that change or do not 
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change upon EGF treatment. P value was estimated with a two-tailed t test 
comparing replicates treated with 100 ng/ml to 0 ng/ml EGF. (F) Single-cell 
nonresponding features (143 principal components) define the cellular state 
space, projected using UMAP embedding. (Left) Schematic of features 
derived from multiple spatial scales. (Middle) Position of cells from each 
condition. (Right) Values from selected cellular state features. 


cellular states (Fig. 3B), with class 2 cells
(small cells grown in densely populated
areas and having abundant early endosomes)
showing a particularly high EC50 and class
15 cells (grown in sparsely populated regions
and having nuclear Yap1) a low EC50.
Inferring single-cell EC50 values revealed that the
sensitivity of each signaling node was adapted
to the cellular state in distinct ways and
for different ranges of EGF concentrations
(Fig. 3C and fig. S6F).
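As a minimal sketch of how an EC50 could be read off a dose-response curve, the following Python fits a Hill function by grid search. The Hill form, the fixed Hill coefficient, the choice to pin the plateaus to the observed minimum and maximum responses, and all function names are illustrative assumptions; the text does not specify how the curves were fitted.

```python
import math


def hill(c, bottom, top, ec50, n=1.0):
    """Hill dose-response curve; activation is half-maximal at c == ec50."""
    if c <= 0:
        return bottom
    return bottom + (top - bottom) * c**n / (ec50**n + c**n)


def fit_ec50(doses, responses, n=1.0, grid_points=400):
    """Return the EC50 on a log-spaced grid that minimizes squared error.

    Simplification (assumption): the lower and upper plateaus are fixed to
    the smallest and largest observed responses instead of being fitted.
    """
    bottom, top = min(responses), max(responses)
    positive = [d for d in doses if d > 0]
    # Search one decade below the lowest and above the highest nonzero dose.
    lo = math.log10(min(positive)) - 1
    hi = math.log10(max(positive)) + 1
    best_ec50, best_err = None, float("inf")
    for i in range(grid_points + 1):
        ec50 = 10 ** (lo + (hi - lo) * i / grid_points)
        err = sum((hill(d, bottom, top, ec50, n) - r) ** 2
                  for d, r in zip(doses, responses))
        if err < best_err:
            best_ec50, best_err = ec50, err
    return best_ec50
```

Applying `fit_ec50` separately to the mean responses of each cellular state class would give one EC50 per class, which is the kind of comparison described above. Because the plateaus are pinned to observed values, the estimate is biased when the highest dose does not saturate the response.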

To test whether signaling node-specific, 
cellular state-conditioned sensitivity to EGF 
could provide enough information to individual 
cells to perceive a wide range of EGF concen- 
trations in a contextual manner, we quanti- 
fied the amount of mutual information between 
the signaling responses of individual cells and 
the five doses of EGF they are exposed to (13).
Perfect decoding ability would show mutual
information of log2(5), or about 2.3 bits. When cells
were considered to use information from one 
node (unimodal perception) and it was as- 
sumed that their signaling was not condi- 
tioned to the cellular state (noncontextual), 
cells could only distinguish large differences 
between individual doses (Fig. 3D) and had 
little decoding capacity (0.7 bits) (Fig. 3E and 
fig. S7, A to C). When we considered that the 
cellular state affects the signaling response 
(contextual), smaller differences between indi- 
vidual doses could be distinguished and the 
decoding capacity of individual cells increased 
(1.2 bits). When cells were considered to use 
information from multiple nodes (multimodal 
perception) (14), they could approach perfect


decoding (2.1 bits), but only when condition- 
ing by the cellular state was considered (Fig. 
3, D to E). Inhibitor experiments showed 
that a multimodal response was required to 
achieve the greatest accuracy. Inhibiting 
either MEK or AKT reduced the perception 
accuracy of EGF concentration and inhibiting 
both led to a stronger reduction (Fig. 3F and 
fig. S7, D to F). This indicates that multimodal
perception could enable individual cells
to accurately distinguish a range of EGF con- 
centrations in a manner that is conditioned to 
the cellular state. 
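The decoding capacity above can be illustrated with a short Python sketch that computes the mutual information, in bits, between the true dose label of each cell and the dose label decoded from its signaling response. The plug-in (empirical frequency) estimator and the function name are assumptions made for illustration; the estimator used in the study is not described in this text.

```python
import math
from collections import Counter


def mutual_information_bits(true_labels, decoded_labels):
    """Plug-in estimate of I(true; decoded) in bits from paired labels."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, decoded_labels))
    p_true = Counter(true_labels)
    p_dec = Counter(decoded_labels)
    mi = 0.0
    for (t, d), count in joint.items():
        p_joint = count / n
        # p(t, d) / (p(t) * p(d)), written with raw counts.
        ratio = count * n / (p_true[t] * p_dec[d])
        mi += p_joint * math.log2(ratio)
    return mi
```

With five equally frequent doses, a decoder that always recovers the correct dose reaches log2(5) bits, and a decoder whose output is independent of the dose reaches 0 bits. The reported values of 0.7, 1.2, and 2.1 bits lie between these two limits.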


Multimodal perception comprehensively maps 
cellular state space 


Conditioning by the cellular state implies that 
the multimodal response of an individual 
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(A) (Left) Regression approach. (Upper middle) Measured and predicted 
levels of three signaling response markers. R², explained variance. (Lower
middle) Dominance analysis of cellular state features in explaining signaling
responses. Yellow boxes indicate specific examples. (Right) Side-by-side
projection of measured and predicted levels of pERK. Scale bars: 15 µm.
(B) (Left) Logistic regression. (Right) Confusion matrices for class
predictions of bimodal responses. (C) (Left) Measured and predicted
levels of pERK across all concentrations of EGF. (Right) R² for all


cell to EGF carries a considerable amount 
of information about the cellular state, espe- 
cially if nodes in a network not only contain 
redundant but also unique information and 
can act in a synergistic manner (Fig. 4A). For 
instance, the cytoplasmic abundances of pERK 
and pMTOR in cells show a strong positive 
correlation, suggesting redundancy, which is 
observed to various degrees for all nodes in 
the network (Fig. 4A and fig. S8A). However, 
the amounts of pERK and pMTOR also scale
with cellular state properties that are distinct
for each node, as illustrated
for Sec13 and HSP60 (Fig. 4A). This can be
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revealed by partial correlation analysis, indi- 
cating that the amounts of pERK scale with 
the abundance of Sec13 whereas the amounts 
of pMTOR scale with the abundance of HSP60, 
and can be appreciated in images of individual 
cells that differ in the abundance of one but 
not the other cellular state marker (Fig. 4A). 
To estimate the amount of information pERK 
and pMTOR carry about these cellular state 
properties and to decompose it into redundant 
(captured by either node), unique (captured by 
one node), and synergistic components (cap- 
tured only by the combined response of both 
nodes), we employed partial information de- 


responses induced by the preexisting cellular state- and variation-induced 
ment. (Left) pERK at three EGF concentrations predicted by 
cellular state. (Right) pERK at three EGF concentrations corrected for 

in cellular state. (E) (Left) Quantification of cellular state- and 
EGF-induced variation as described in (D). (Right) Mean variation 

induced by all concentration changes in EGF (x axis) versus the mean 
induced by cellular state at all concentrations of EGF (y axis). 


composition (PID) (15). pERK and pMTOR 
capture redundant but also unique information
about Sec13 and HSP60 abundance,
respectively (Fig. 4A, right bar graphs). For 
the latter, pERK and pMTOR also capture 
synergistic information. 
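The partial correlation analysis mentioned above can be sketched in Python with the standard first-order formula, which correlates two variables after removing the linear effect of a third. Which variable is controlled for, and the function names, are assumptions for illustration; the text does not state the exact estimator. This sketch does not attempt the partial information decomposition itself.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))
```

For example, `partial_correlation(perk, sec13, pmtor)` (hypothetical per-cell lists) would ask whether pERK still scales with Sec13 once the variation shared with pMTOR is accounted for.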

We next applied this analysis to all tested 
nodes (Fig. 4, B and C, and fig. S8B), showing 
that each node displays some distinct scaling 
with various cellular state properties (Fig. 4B). 
Many of these effects have previously been 
reported such as (negative) pMEK scaling 
with local cell density (16), pAKT scaling 
with cell area (8), pRSK scaling with the 
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Fig. 3. Contextual multimodal perception of EGF concentration. (A) 
Cellular state classes (highest membership coefficient) from fuzzy clustering 
on the cellular state landscape. (B) (Left) dose-response curves of pERK 
for the whole population (gray) and for three cellular state classes. Dots, 
individual replicates. (Bottom) Values of cellular state features. Yellow boxes 
indicate examples. (Right) Cellular state markers and (left) pERK in 
representative cells of three state classes outlined in green. Scale bars:
2.5 µm. (C) Single-cell EC50 was calculated by combining measured
responses with predicted (regression). (Left) Single-cell EC50 values for each
signaling node. (Right) EC50 values on cellular state landscape for three
signaling nodes. (D) Capacity to distinguish two EGF concentrations based 
on different perception modes. Decoding capacity is calculated from 
predicted and true EGF exposure labels. (E) Decoding capacity of EGF 
concentrations for different perception modes for all concentrations 
simultaneously. (F) Decoding capacity of contextual multimodal perception 
in the presence of different inhibitors. 


transcriptional state of cells (pPolII) (17), 
pGSK3b scaling with the abundance of late 
endosomes (VPS35) (18), and pEGFR scaling 
with the abundance of Paxillin (19) (Fig. 4B). 
Only by analyzing these multiple connec- 
tions between signaling responses and cellu- 
lar state properties collectively in the same 
cells and across many cells can we observe 
that nodes capture not only redundant but 
also unique and synergistic information about 
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the cellular state (Fig. 4C and fig. S9). To test 
whether this information propagates through 
the network, we used inhibitors. The redundant
information about cellular state features
between upstream (pAKT and pERK)
and downstream nodes (nuclear FoxO3a
and pS6, respectively) was reduced upon inhibition
of the upstream node (Fig. 4D), whereas
the unique information about the cellular
state captured by the downstream node was


not reduced and sometimes increased (fig. 
S8C). For instance, although nuclear depletion
of FoxO3a was reduced when AKT was
inhibited, it was still heterogeneous and correlated
more strongly with position in the
cell cycle (fig. S8D). Thus, cellular state information
can propagate through the network,
and nodes can integrate information from
multiple sources depending on the activity of
other nodes.
Fig. 4. Multimodal perception captures the diversity of cellular states. 
(A) (Left) Encoding regimes of cellular state information by signaling 
nodes. (Middle left) Scatterplot of pMTOR and pERK abundances with 
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ances visualized. (Inlets) Partial 
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information about the cellular state (averaged across all features) that 
signaling nodes encode (pairwise comparison). (Bottom) Encoding cellular 
state information as average across all pairwise combinations of signaling 
nodes. (D) Transfer of cellular state information from pAKT and pERK to 
FoxO3a and pS6 in presence of inhibitors and amount of redundant 
information on the indicated cellular state properties. (E) (Top) 
Calculating neighbor similarity between cellular state space and
different signaling spaces. (Bottom) Similarity of cellular state space
and different signaling neighborhoods. (F) Cellular state classes
(as in Fig. 3A) predicted for each cell (logistic classifier) from indicated
signaling spaces. (Inlets) Confusion matrices. Color indicates the
fraction of cells assigned to a class. Square size indicates posterior
probability of cells with that prediction.
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To analyze whether collectively, as a multi- 
modal percept, signaling nodes can compre- 
hensively map cellular state space and inform 
an individual cell on its position in this space, 
we quantified the overlap between the statis- 
tical neighbors of an individual cell in cellular 
state space with its statistical neighbors in 
signaling space (Fig. 4E). Unimodal percep- 
tion can only map neighborhoods within small 




regions of the cellular state space (Fig. 4E and 
fig. S10A) whereas a multimodal percept based 
on 10 nodes can map the neighborhood across 
the whole cellular state space (Fig. 4E and fig. 
S10B). The information gain in multimodal 
perception is particularly high for the first six 
nodes in the network (fig. S10C). Furthermore, 
only multimodal perception can accurately 
predict to which cellular state class an individual 


cell belongs for a large diversity of cellular states 
(Fig. 4F and fig. S10, D and E). 
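The neighborhood comparison described above can be sketched in Python as the fraction of a cell's k nearest neighbors in one space that are also among its k nearest neighbors in another space. The use of Euclidean distance, a fixed k, brute-force search, and the function names are assumptions for illustration; the study's exact similarity measure is not given in this text.

```python
def knn_indices(points, k):
    """For each point, return the set of indices of its k nearest neighbors.

    Brute-force squared Euclidean distance; ties are broken by index.
    """
    neighbors = []
    for i, p in enumerate(points):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        )
        neighbors.append({j for _, j in dists[:k]})
    return neighbors


def neighbor_overlap(space_a, space_b, k):
    """Per-cell fraction of k nearest neighbors shared between two spaces.

    space_a and space_b list the same cells in the same order, for example
    cellular state features and signaling node responses.
    """
    nn_a = knn_indices(space_a, k)
    nn_b = knn_indices(space_b, k)
    return [len(a & b) / k for a, b in zip(nn_a, nn_b)]
```

A value of 1.0 for a cell means that its signaling neighborhood reproduces its cellular state neighborhood exactly. Comparing a space built from one node with a space built from several nodes gives the unimodal versus multimodal contrast discussed above.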


Multimodal perception links cellular state to 
decision-making 

To test whether multimodal perception is
used by cells to make context-aware decisions,
we exposed cells to longer EGF stimulation. To
avoid problems resulting from the fact that 
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Fig. 5. Multimodal perception links cellular state to decision-making. 
(A) (Left) Position in cellular state space conditions signaling response. 

On long time scales, this changes the position in state space. (Top right) 
UMAP of the responding cellular state. (Bottom right) UMAP of the 
nonresponding cellular state. (B) Fractions of cellular state features related 
to properties and activities that change after 16 hours of EGF. (C) (Left) 
Classification of pRB status. (Middle) Fraction pRB status. (Right) 
Representative images. (D) (Left) Measured and predicted (logistic 
regression using indicated variables) pRB status on the cellular state 
landscape. Accuracy (middle), and ROC curves (left) of the classifier. 

(E) (Left) Perturbations that only affect position in signaling space and not 
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some properties of the cellular state may 
change during longer stimulation times, we 
defined a nonresponding cellular state space 
(Fig. 5A and fig. S11A). This resulted in the 
exclusion of about half of all cellular state 
features and enrichment of features related to 
the transcriptional state and cell proliferation 
but also to cell morphology, the cytoskeleton, 
and membrane trafficking (Fig. 5B and fig. S11A). 
One of these is phosphorylated retinoblas- 
toma protein (pRB), which is under the con- 
trol of multiple signaling nodes (17, 20). RB is 
phosphorylated in actively proliferating cells 
and marks an EGF dose-dependent decision- 
making event to reenter the cell cycle (Fig. 5C). 
Projecting the status of individual cells as 
pRB-negative (pRB-) or pRB-positive (pRB+) 
after 16 hours of EGF induction on the cellular 
state landscape showed specific patterns across 
different doses of EGF, which were accurately 
predicted by using properties of the cellular 
state (Fig. 5D and fig. S11B). pRB status was 
also accurately predicted by the multimodal 
percept of single cells but less accurately by 
the responses of individual signaling nodes 
or their combination (Fig. 5D and fig. S11B). 
Thus, both the position in cellular state space
and the position in multimodal signaling
space allow for an accurate prediction of
whether single cells reenter the
cell cycle.

We next treated cells with inhibitors of AKT 
(AKTi) and MEK (MEKi). At low concentrations 
these inhibitors did not affect the positions of 
cells in cellular state space but did so in mul- 
timodal signaling space (fig. S11, C and D), 
both alone and in combination, resulting in 
altered fractions of pRB status (Fig. 5E). These 
altered responses were not accurately predicted 
by models based on the cellular state if trained 
on unperturbed cells but were accurately pre- 
dicted by models based on the multimodal 
percept (Fig. 5E, right bar graph). Models based 
on the cellular state trained on perturbed cells 
were also accurate (fig. S11E). Thus, the position 
in multimodal signaling space couples cellular 
state to decision-making, which is altered by 
the inhibitors. To explore this, we projected
pRB status of both untreated and treated cells
in the multimodal signaling landscape, reveal- 
ing a sharp decision boundary (Fig. 5F). We 
then defined five classes of cells from cellular 
state space each with a distinct decision-making 
profile across perturbations. These classes oc- 
cupy different regions in cellular state space and 
consequently different regions in multimodal 
signaling space (Fig. 5F). Although their posi- 
tions in cellular state space remained the same 
during inhibitor treatment, their positions in 
multimodal signaling space changed, affecting 
the response of these cells (Fig. 5G and fig. S11F).
For instance, class 2 cells, which grew in regions
of low local cell density and were large with 
abundant late endosomes, were located on the 
pRB+ side of the decision boundary (Fig. 5F). 
Individually inhibiting AKT or MEK affected 
their position in multimodal signaling space 
but did not result in crossing of the pRB 
decision boundary (Fig. 5G). Inhibiting both 
AKT and MEK resulted in altered pRB status 
(Fig. 5G). Thus, cells in this state were only 
prevented from re-entering the cell cycle by 
inhibiting both AKT and MEK. By contrast,
class 4 cells, which grew in regions of high
local cell density, were small and had rela- 
tively few endosomes but an abundant Golgi 
complex, and were located on the pRB- side 
of the decision boundary in the unperturbed 
condition (Fig. 5F). Although all treatments
affected their position in multimodal signaling
space, they only became pRB+ upon inhibition
of AKT (Fig. 5G). Thus, cells in this
state remained nonproliferative upon EGF 
stimulation but became aberrantly prolifera- 
tive upon AKT inhibition. 


Discussion 


We have shown that the heterogeneity in 
acute signaling responses of individual cells 
contains partially nonredundant information 
about the cellular state that influences the 
growth factor response. The cellular state has 
a stronger effect on these responses than changes 
in growth factor concentration and thus repre- 
sents an important source of information to 
predict these responses. Collectively, as a mul- 
timodal percept this enables individual cells 
to accurately sense a range of growth factor 
concentrations and to integrate this with their 
cellular state to make cellular state-dependent 
decisions. The cellular state is thus at least as 
relevant as growth factor concentration in de- 
termining cellular responses. This suggests that 
one purpose of cellular structures in controlling 
the activation of signaling nodes (8, 9) is to inject 
information about the cellular state into the 
decision-making process. It may also help 
explain why signaling responses are heteroge- 
neous and signaling networks have a certain 
complexity. Although the redundant elements 
in network complexity can counteract uncer- 
tainty in individual responses (5) the nonre- 
dundant and synergistic elements can enable 
adaptive responses of multiple nodes to act 
as a multimodal percept that captures a large 
amount of information about the cellular state. 
This suggests that a collective of cells repeatedly 
generates the same spectrum of single-cell 
responses through generating the same land- 
scape of cellular states. This may be important 
during development in which spatial effects 
and self-organization could drive the robust 
formation of such landscapes, providing the 
appropriate context for morphogens to induce 
a range of cellular decisions even in the ab- 
sence of well-defined gradients (21, 22). That 
the cellular state determines the heteroge- 
neous response of signaling nodes to clinically 


tested inhibitors resulting in different and
sometimes unwanted state-dependent decisions
may also be relevant for the treatment of diseases
such as cancer (23).
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Carbene reactivity from alkyl and aryl aldehydes 


Lumin Zhang, Bethany M. DeMuynck†, Alyson N. Paneque†, Joy E. Rutherford†, David A. Nagib*


Carbenes are highly enabling reactive intermediates that facilitate a diverse range of otherwise
inaccessible chemistry, including small-ring formation and insertion into strong σ bonds. To access such
valuable reactivity, reagents with high entropic or enthalpic driving forces are often used, including
explosive (diazo) or unstable (gem-dihalo) compounds. Here, we report that common aldehydes are
readily converted (via stable α-acyloxy halide intermediates) to electronically diverse (donor or neutral)
carbenes to facilitate >10 reaction classes. This strategy enables safe reactivity of nonstabilized
carbenes from alkyl, aryl, and formyl aldehydes via zinc carbenoids. Earth-abundant metal salts
[iron(II) chloride (FeCl2), cobalt(II) chloride (CoCl2), copper(I) chloride (CuCl)] are effective catalysts
for these chemoselective carbene additions to σ and π bonds.


In organic synthesis, several important
classes of chemical reactivity are exclusively
mediated by carbenes and carbenoids
(Fig. 1A) (1-5). However, access to
these versatile intermediates is often limited
by the need for highly energetic diazoalkane
precursors. Given the strong entropic
and enthalpic driving forces inherent in the
release of N2 from these reagents, strict precautions
are necessary to prevent explosive,
uncontrolled chain reactions (6). Strategies to
address this underlying safety issue include
flow chemistry and in situ diazotization (7-9).
Carbonyl and arene stabilizing groups are also
frequently used (10), which influence carbene
polarity and selectivity (11, 12). Yet nonstabilized
alkyl carbenes (with eliminable α protons)
are rarely accessed by diazoalkanes (13).


A Carbenes enable diverse reactivity, but accessibility is an ongoing challenge B 
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Fig. 1. Strategies to harness carbene reactivity. (A) Synthetic approaches 


In a complementary approach, Simmons-Smith
cyclopropanations by Zn carbenoids
(from gem-dihalides) enable incorporation of
the smallest divalent carbon, CH2 (14, 15). However,
α protons are rarely tolerated owing to
1,2-H migration of the Zn carbenoid (16). Despite
elegant solutions for Matteson rearrangement
by Li-boronate carbenoids (17-20),
Mo-mediated ketone deoxygenation (21, 22), and
Au-catalyzed alkyne cyclizations (23, 24), there
remains no general approach to access nonstabilized
alkyl carbenes or for their use in the
wide range of carbene reactivity available to
stabilized diazo reagents (25).

The carbonyl is an ideal carbene precursor,
because it is highly accessible, both commercially
and synthetically. Toward this goal, we
were inspired by the classic Clemmensen reduction
of carbonyls with Zn(Hg) and HCl
(26, 27). Motherwell also harnessed this Zn
carbenoid reactivity to deoxygenate ketones
by 1,2-H migration of siloxy analogs (28).
However, this tendency toward α-elimination
precludes access to the broad range of diazo-based
carbene reactivity (29).

In designing a general catalytic strategy to
convert carbonyls to electronically diverse
carbenes (Fig. 1B), we sought to develop a safe,
stable carbenoid precursor that does not rely
on activation by highly eliminable diazo, dihalo,
or siloxy groups. Instead, we were cognizant
that acyl chloride (AcCl) readily adds to carbonyls,
yielding stable α-acyloxy halides, even
with enolizable aldehydes (30). We previously
showed that acyl iodide (AcI) adducts of
carbonyls enable distinct ketyl radical reactivity
by atom- or electron-transfer reduction
mechanisms (31, 32). By contrast, we hypothesized
that the more stable pivaloyl chloride
(PivCl) adduct A may prevent the radical pathway
and permit chemoselective formation of
α-acyloxy Zn carbenoid B (33). Importantly,
by accessing this key intermediate in the absence
of strong acids, we proposed that the
controlled α-acyloxy elimination by base metal
catalysts could form reactive metal carbene C.
We anticipated that catalyst influence could
also impart distinct reactivity compared with
simple Zn carbenoids. Ideally, the resulting
(i) carbene dimerization, (ii) σ bond insertion,
or (iii) small-ring formation would be chemoselectively
dictated by these catalysts and
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of carbene precursors are readily prepared and exhibit substantially 
improved safety profiles versus diazo reagents. Ar, aryl; Bz, benzoyl; EWG, 
electron-withdrawing group; M, metal catalyst; Ph, phenyl; Piv, pivaloyl; 

X, Y, and Z, generic atoms. 
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override innate 1,2-rearrangements of alkyl 
Zn carbenoids. 

Another key design element is that several
classes of these carbene precursors are easily
prepared, stable for months in cold storage,
and handled safely at large scale. As shown
in Fig. 1C, we found that less-electrophilic
carbonyls are best paired with more-electrophilic
acyl halides (Cl < Br < I) to afford increasingly
reactive carbene precursors (α-Cl < α-Br <
α-I). For example, PivCl readily adds to benzaldehyde
to generate bench-stable aryl
carbene precursor 1. Yet, less-electrophilic,
aliphatic aldehydes are best combined with
benzoyl bromide (BzBr) to form nonstabilized
alkyl analog 2, which is nonetheless stable to
chromatography. Both reactions can be performed
at >20-g scale without the safety concerns
that are inherent to the corresponding diazo
analogs, which violently decompose upon loss
of N2 near 100°C [with a decomposition enthalpy
(ΔHD) >200 kJ/mol, as measured by
differential scanning calorimetry (DSC)] (6).
Conversely, these α-acyloxy halides are considerably
more stable, with drastically lower
decomposition enthalpies observed (ΔHD <
30 kJ/mol in all cases) and only at much higher
temperatures than typical reaction conditions
(Tinit: 1, 155°C; 2, >300°C; 3, 235°C). Lastly,
benzoyl iodide (BzI) readily combines with
the least-reactive aldehyde, formaldehyde, to
access methyl carbene precursor 3, which is a
common protecting group that is commercially
available in kilogram quantities and provides
a safe alternative to diazomethane.

To investigate our design, carbene dimerization
was examined by subjecting benzaldehyde
to the proposed strategy, a three-stage
sequence entailing (i) PivCl addition, (ii) Zn
insertion, and (iii) metal catalysis. This procedure
can either be completed successively in
one pot or by preisolation of PivCl adduct 1.
As shown in Fig. 2, several base metal salts
(e.g., CuCl, CoCl2) enable efficient dimerization
to stilbene 4, with notably high catalyst
stereocontrol observed for CoCl2 [90%, 18:1
E:Z, via 1], which is surprising given the absence
of bulky ligands that have been shown
to be integral for selectivity in other carbenoid
reactions (16). As further evidence
of catalyst control, no reaction was observed
with NiCl2 or without catalyst. In addition
to serving as a proof of concept of carbene
reactivity, these mechanistic probes also demonstrate
that electronically diverse substituents
(6 to 8) are well tolerated in the aryl
carbene component.


[Fig. 2 graphic: aryl carbene dimerization (via PivCl; Zn; 5% CoCl2) and alkyl carbene dimerization (via BzBr; Zn; 5% CoCl2) with isolated yields and E:Z ratios for the aryl and alkyl products, a comparison with the hydrazone route (prone to α-elimination), and a functional group robustness screen (5% CoCl2 + 1 (or 2) + additive).]


Fig. 2. Carbene dimerization. Catalytic dimerization of aldehyde-derived aryl and
alkyl carbenes enabled by several base metal salts. The synthetic generality
and broad functional group tolerance of this carbene dimerization are shown.
Reaction conditions for the aryl carbene are as follows: aldehyde (1 equiv), PivCl
(2 equiv), 2% ZnCl2, CHCl3, 2 hours; Zn (2 equiv), LiCl (2 equiv), THF, 12 hours;
5% CoCl2, CH2Cl2:THF, 2 hours. Reaction conditions for the alkyl carbene
are as follows: aldehyde (1 equiv), BzBr (1.2 equiv), 2% ZnBr2, CH2Cl2, 2 hours;
Zn (2 equiv), LiCl (2 equiv), THF, 12 hours; 5% CoCl2, CH2Cl2:THF, 2 hours.
To evaluate robustness, experiments were performed with additive and either
1 or 2 (1 equiv each). See the supplementary materials for full experimental
details. Isolated yield and alkene stereochemistry (E:Z) are indicated.
OTf, trifluoromethanesulfonate; p, para; tBu, tert-butyl; Ts, toluenesulfonyl.
Fig. 3. Catalytic carbene reactivity. (A) Aryl carbene. Reaction conditions
are as follows: aldehyde (2 equiv), PivCl (4 equiv), 4% ZnCl2, CHCl3, 2 hours;
Zn (2 equiv), LiCl (2 equiv), THF, 6 hours; 5% FeCl2 or 2.5% Rh2(OAc)4,
carbene trap (1 equiv), CH2Cl2:THF, 12 hours. (B) Alkyl carbene. Reaction
conditions are as follows: aldehyde (3 equiv), BzBr (3.6 equiv), 4% ZnBr2,
CH2Cl2, 2 hours; Zn (3 equiv), LiCl (3 equiv), THF, 6 hours; 5% FeCl2, carbene
trap (1 equiv), CH2Cl2:THF, 12 hours. (C) Methyl carbene. Reaction conditions
are as follows: iodide (3 equiv), Zn (3 equiv), 5% FeTPPCl, carbene trap
(1 equiv), CH2Cl2:THF, 4 hours. (D) Complex molecule applications. See
the supplementary materials for full experimental details. Isolated yield,
diastereomeric ratio (dr), and regioselectivity ratio (rr) are indicated.
D, deuterium; EDG, electron-donating group; Et, ethyl; Me, methyl.
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We next tested our hypothesis that alkyl 
carbene reactivity may be harnessed by this 
catalytic approach in a safer and more selec- 
tive manner than by diazoalkanes or dihalides. 
The distinctively broad scope of aliphatic
aldehydes (even with eliminable α protons)
that could be converted to carbenes was the
highlight of this initial study. To this end, we
found that CoCl2 effectively catalyzes carbene
dimerization from hydrocinnamaldehyde (to
alkene 5 via 2: 99%, 2:1 E:Z). Notably, no
α-elimination is observed, in contrast to the
uncatalyzed siloxy variant, which is limited to
α,β-unsaturated carbonyls to prevent such a
pathway (28, 34). For further comparison, the
diazo strategy was examined by subjecting a



Fig. 4. Mechanistic experiments. (A) The carbene dimerization reactivity shown 
above is not accessible via diazoalkanes. (B) The proposed Zn carbenoid is 
validated by an acidic quench. Conversely, single-electron reduction by Mn 
affords ketyl radical reactivity but not carbene dimerization. (C) Electrophilic 
character of carbene is illustrated by selective reactivity with nucleophilic 

traps; sulfonium ylide affords polarity-reversed reactivity with electrophiles. 
Comparison with alkyl diazo carbene reactivity

Yield of carbene dimer 5 (yield of α-elimination product 5′ in parentheses), 5% catalyst:

catalyst     NaH, rt    NaH, 60°C   Cs2CO3, rt   Cs2CO3, 60°C
none         0% (0%)    1% (2%)     0% (0%)      0% (6%)
Rh2(OAc)4    0% (0%)    0% (19%)    0% (0%)      0% (7%)
AgOTf        0% (0%)    0% (26%)    0% (0%)      0% (14%)
CuCl         0% (0%)    2% (9%)     0% (0%)      0% (4%)
CoCl2        0% (0%)    25% (5%)    0% (0%)      1% (5%)

Electronic nature of carbene



hydrazone to various catalysts and bases to access
its carbene. In all cases, only α-elimination
is observed (up to 61% 5′), likely owing to
the high temperatures (100°C) required for
reactivity.

By contrast, the valuable alkyl carbene reactivity
enabled by this approach includes
dimerization of aldehydes that contain synthetically
tractable halides, alkenes, silanes,
carbamates, ethers, sulfides, amides, ketones,
and α substituents, including within natural
product and drug scaffolds (9 to 20). In all
cases, rapid dimerization preferentially occurs
over other types of carbene reactivity, such as
intramolecular cyclopropanation (18). Moreover,
a robustness study (35) was conducted


[Fig. 4 graphic. Recoverable labels: α-elimination 5′; 2.5% FeTPPCl, α-Me-styrene, Zn; >300 g of 3 (>1 mole) (Photo credit: D.A.N.); Carbenoid vs ketyl radical pathway; no catalyst; radical pathway; pinacol, 64, 80%; H-SiMe2Ph; 1 equiv each; 5% cat.]



to further assess the broad functional group 
tolerance of this strategy, wherein chemically 
diverse additives were combined with catalyst 
and carbene precursors 1 (or 2) to explore their 
viability in the dimerization step. 

We then subjected these aldehyde-derived 
carbene precursors to several distinct classes 
of reactivity, including signature carbene 
reactions: cyclopropanation and X-H insertion 
(36, 37) (Fig. 3A). While probing reactivity with 
several common carbene traps, we observed 
that the catalysts that promoted rapid di- 
merization (vide supra; CoCl,, CuCl) are less 
efficient at chemoselective cross-coupling. 
Conversely, metallocarbenes generated from 
FeCl, or rhodium(II) acetate [Rh2(OAc)4] 


enable alternate modes of carbene reactivity aside from dimerization. For example,
cyclopropanation of styrenes with aryl carbenes is efficiently catalyzed by FeCl2 (21 and 22,
up to 95%) but not CoCl2. In contrast with typical Simmons-Smith selectivity (38, 39), this
Fe-catalyzed strategy is specifically selective for more-substituted, electron-rich alkenes
(e.g., monoterpene 23). Conversely, cyclopropanation of electron-deficient alkenes does
not occur, as expected. However, if Me2S is added as a cocatalyst to access the
polarity-reversed sulfonium ylide (see Fig. 4C), as has been demonstrated with diazoalkanes
(40), then electrophilic acrylates and enones can now be selectively benzylated (24 and 25),
including in selective competition over other alkenes, as in the case of carvone (26). This
sulfonium-mediated strategy also affords epoxides and aziridines by combination of
these aldehyde-derived carbenes with electronically diverse aldehydes and imines (27
to 30), without the need for diazoalkanes and in a more modular fashion that relies on
aldehyde rather than typical sulfonium ylide precursors.

[Fig. 4 panel data]
(B) Zn carbenoid, HCl quench: de-chlorination product 63, 99%. Standard conditions: stilbene 4 not observed.
(D) Catalyst: major product, yield
    CuCl: carbene dimer, 73%
    CoCl2: carbene dimer, 80%
    FeCl2: cyclopropane, 77%
    Rh2(OAc)4: Si-H insertion, 94%
(E) Scale-up yields: 0.1 mmol, 74%; 1 mmol, 81%; 10 mmol, 81%. 0.1 mmol, 91%; 1 mmol, 99%; 10 mmol, 98%; 100 mmol, 91%.

(D) Competition experiments illustrate catalyst control of product selectivity.
(E) The improved thermal stability of these carbene precursors relative to
diazoalkanes allows a safer scale-up of batch chemistry. The photo shows the
simple handling of a large quantity of 3, which is a safer alternative to
diazomethane. Ac, acetyl; cat, catalyst; rt, room temperature; tol, tolyl.

Besides cyclopropanation, the most commonly used reactivity of metal carbenes is
catalytic X-H insertion (36, 37). Although Zn-carbenoids have not been useful in enabling
such reactivity, we sought to examine if our metal-catalyzed activation could mirror the
robust reactivity of diazo reagents. We found that a range of alkyl and aryl amines efficiently
underwent N-H insertion to generate benzyl amines (31 to 33). Insertion into a
less-polarized P-H bond also occurred with three classes of organophosphorus compounds, to
access phosphates, phosphine oxides, and phosphine sulfides (34 to 36). Given the
electrophilic nature of this carbene, other nucleophilic X-H donors were also examined.
Si-H insertion into silanes occurred smoothly with either FeCl2 (56% 37) or a more typical
carbene catalyst, Rh2(OAc)4 (83%). Insertion into Si-D yielded complete α-deuteration
(>95% D, 38) in tetrahydrofuran (THF), confirming the carbene nature of this catalytic
mechanism, which is amenable to several classes of alkyl silanes (37 to 39). Insertion
also occurred into σ bonds of varying polarity, including B-H and S-H bonds (40 and 41).
Overall, we observed that more-nucleophilic H donors (e.g., phosphine, silane, borane) are
highly suited to this carbene reactivity, which has now been applied to 10 reaction classes
[dimerization, four (2+1) cyclizations, five X-H insertions].

Because none of these alkyl carbene reactions has been previously accessible by Zn
carbenoid activation of carbonyls, we sought to test the limits of this reactivity with
nonstabilized alkyl aldehydes (Fig. 3B). First, we noted that typical 1,2-H migration (28, 34)
did not occur in this mild system at room temperature. Instead, heating (60°C) was needed to
promote catalytic deoxygenation (42). This lack of background H migration enables
successful realization of several classes of alkyl carbene reactivity, including
cyclopropanation (43 and 44), epoxidation and aziridination (45 and 46), and σ bond
insertion (47 to 49), all with similar efficiency to aryl carbenes. Notably, alkyl carbene
reactivity is typically challenging to access by other methods because diazoalkane and
dihalide precursors are highly prone to H migration. Having harnessed this new substrate
class, we also investigated C-C insertion, wherein ring expansion of α-cyclobutanes and
rearrangement of α-cyclopentanes were each observed (50 to 53). Lastly, we confirmed
viability of the simplest carbene (CH2) for both reaction classes (Fig. 3C): π bond
cycloaddition (54 to 56) and σ bond insertion (57 and 58). We expect this
formaldehyde-based approach to provide a safer alternative to conventional diazomethane
reactivity and to complement modern methods that entail in situ generation (7-9).

To demonstrate viability in more com- 
plex settings, the aldehydes of two bile acids 
(among the most complex carbonyls dimer- 
ized in Fig. 2) were subjected to four distinct 
classes of carbene reactivity (Fig. 3D). First, 
the lithocholic acid containing both a ketone 
and aldehyde was selectively elongated at the 
aldehyde by epoxidation (59), aziridination 
(60), and Si-H insertion (61). A comparison 
with the diazo strategy was not possible be- 
cause hydrazone formation was unselective 
between the two carbonyls within this mole- 
cule. Additionally, this approach distinctively 
enables deoxygenation of an aldehyde in the 
presence of three ketones (62). 

Our mechanistic understanding of this catalytic carbene reactivity from aldehydes is based
on a collection of experiments, including intermediate characterization, reactivity
comparisons, and kinetic data (Fig. 4). An investigation of various bases (NaH, Cs2CO3),
temperatures (0° to 100°C), and catalysts (Fig. 4A) verifies that the carbene dimerization
is not accessible via diazoalkanes. The proposed Zn carbenoid, generated in the absence of
catalyst, was characterized by protodechlorination with HCl (63) (Fig. 4B). Conversely, a
stronger Mn reductant affords pinacol coupling via ketyl radicals (64). This product remains
unchanged and does not afford stilbene (4) when resubjected to reaction conditions. In
probing the electronic nature of the catalytic carbene intermediate, we noted its
electrophilicity, as evidenced by higher efficiency of reactivity with nucleophilic
alkenes (α-Me-styrene; Me, methyl) versus electrophilic traps (acrylate) (Fig. 4C).
However, upon introduction of a sulfide cocatalyst, a transient sulfonium ylide (observed
by gas chromatography-mass spectrometry) enables inverted reactivity with such
electrophiles (e.g., 24), without the need for diazo intermediates (40). Furthermore, in a
1:1 competition of an alkene and aldehyde as potential carbene traps, exclusive selectivity
is observed for epoxidation with Me2S versus only cyclopropanation without Me2S (Fig. 4D).
Notably, the metal catalyst also dictates reaction chemoselectivity. For example, in a
competition experiment among three classes of carbene traps, dimerization is exclusively
observed with CuCl (73%) and CoCl2 (80%), yet FeCl2 predominantly affords cyclopropanation
(77%) and Rh2(OAc)4 yields Si-H insertion (94%). Lastly, in situ infrared spectroscopy of
the carbene dimerization reaction and variable time normalization analysis (VTNA) (41)
indicate that the dimerization reaction is second order in catalyst (see supplementary
materials).

Because this carbene generation mechanism does not rely on the evolution of N2 gas, a
substantial improvement in thermal stability was observed for these carbene precursors
(ΔHD = 0 kJ/mol for 2 and 6 kJ/mol for 3) relative to diazoalkanes (ΔHD > 200 kJ/mol)
(Fig. 4E). To harness this superior safety profile, large-scale batch reactions were
performed with quantities that would be prohibitively unsafe with diazoalkanes (e.g.,
CH2N2). In these cases, cyclopropanations succeeded at up to a 1000-times-larger scale
without event.

We have introduced a catalytic strategy for harnessing carbene reactivity from carbonyls.
We expect that the approach will have a threefold impact on the expanded development
of carbene reactions, including through the use of base metal catalysts, the use of safe and
scalable reagents, and improved synthetic access to nonstabilized carbenes.

OCEAN OXYGEN


Changes in North Atlantic deep-water oxygenation 
across the Middle Pleistocene Transition 


Nicola C. Thomas*, Harold J. Bradbury, David A. Hodell 


The oxygen concentrations of oceanic deep-water and atmospheric carbon dioxide (pCO2) are intrinsically
linked through organic carbon remineralization and storage as dissolved inorganic carbon in the deep sea. We
present a high-resolution reconstruction of relative changes in oxygen concentration in the deep North
Atlantic for the past 1.5 million years using the carbon isotope gradient between epifaunal and infaunal
benthic foraminifera species as a proxy for paleo-oxygen. We report a significant (>40 micromole per
kilogram) reduction in glacial Atlantic deep-water oxygenation at ~960 thousand to 900 thousand years ago
that coincided with increased continental ice volume and a major change in ocean thermohaline
circulation. Paleo-oxygen results support a scenario of decreasing deep-water oxygen concentrations,
increased respired carbon storage, and a reduction in glacial pCO2 across the Middle Pleistocene Transition.


During the past 800 thousand years (kyr), glacial carbon dioxide concentrations (pCO2) in
Earth's atmosphere averaged ~190 parts per million (ppm) by volume (1), lower than the
preindustrial value by ~90 ppm (1, 2). Aside from ice core records, atmospheric pCO2
records are fragmentary but suggest that glacial pCO2 may have been higher by 20 to 40 ppm
prior to ~1000 to 800 thousand years ago (ka) during the early Pleistocene (3-7). Such a
drop in glacial atmospheric pCO2 concentrations is one of the proposed causes of increased
continental ice volume during glacial periods across the Middle Pleistocene Transition
(MPT) (8, 9), which occurred between 1.25 and 0.64 million years ago (Ma) (10). During the
MPT, ice sheets grew larger and the duration of glacial cycles increased from primarily
41-kyr oscillations before the MPT to quasi-100-kyr cycles afterward (8, 10, 11). Lowering
glacial pCO2 across the MPT most likely involved increased carbon storage in the deep ocean
through enhanced biological CO2 uptake (12-14) and/or reduced CO2 exchange between the
atmosphere and surface ocean (12, 15, 16), especially in the Southern Ocean (17, 18). These
processes would not only increase carbon storage in the deep ocean but would also reduce
the deep-sea concentration of dissolved oxygen.

We applied a paleoproxy to estimate past changes in oxygen concentration (hereafter
expressed as [O2]) in deep water of the North Atlantic. The proxy uses an empirical
calibration between the [O2] and the carbon isotopic composition difference (Δδ¹³C) between
surface and deep-dwelling benthic foraminifera, as first proposed by McCorkle et al. (19, 20)
and recalibrated by Hoogakker et al. (21) (see supplementary materials). We report Δδ¹³C as
a proxy for North Atlantic deep-water [O2] at International Ocean Discovery Program (IODP)
Site U1385 on the Iberian Margin [37°34.285′ N, 10°7.562′ W, 2578 meters below sea level
(mbsl)] (22, 23) (Fig. 1; supplementary materials). We measured the epibenthic species
Cibicidoides wuellerstorfi, which provides a record of deep-water carbon isotopic
composition (δ¹³C) (21, 24), whereas the infaunal species Globobulimina affinis provides a
record of δ¹³C near the oxic-anoxic boundary in sediment porewaters (21, 24, 25). The carbon
isotope gradient (Δδ¹³Ccib-aff) expresses the difference between the two.

The main assumption of the Δδ¹³Ccib-aff-derived [O2] method is that the carbon isotope
gradient between the sediment-water interface and the oxic-anoxic boundary is controlled
by deep-water [O2] and hence total dissolved inorganic carbon (DIC) released during
aerobic respiration. Addition of excess DIC through anaerobic processes such as nitrate
reduction or sulfate reduction serves to lower δ¹³C at the oxic-anoxic boundary (20, 26) and
causes an overestimation of [O2] (26). Thus, derived paleo-[O2] estimates represent maximum
deep-water oxygen concentrations. The Hoogakker et al. (20) calibration includes release of
DIC through anaerobic processes as demonstrated by the nonzero y intercept in their
empirical calibration (see supplementary text, section 2).

Low organic carbon concentrations (<1%) at Site U1385 lead to relatively low rates of
organic carbon oxidation and a deep sulfate-methane transition at ~50 meters below the
seafloor (mbsf) (27). The Iberian Margin near-surface sediment column demonstrates low
rates of sulfate reduction in the upper tens of centimeters (28), suggesting that Δδ¹³C at
Site U1385 is governed predominantly by the amount of aerobic respiration controlled by
deep-water [O2]. Following the recommendations of Jacobel et al. (26), we compare the
paleo-[O2] estimates derived from Δδ¹³Ccib-aff with other redox/oxygenation proxies: U/Ca,
U/Mn, and percent C26OH/(C26OH + C29). We also assess relative changes in paleoproductivity
that may lead to variable organic carbon fluxes using several proxies: Δδ¹³Ccib-uvig,
Uvigerina spp. abundance, and sedimentary Ba/Al ratios (see supplementary text, sections 3
and 4).

The cores from different holes at Site U1385 were spliced to produce a composite section

Fig. 1. Atlantic hydrographic profiles of [O2] and δ¹³C relative to site
locations. (A) Deep-water [O2] (µmol kg⁻¹) profile at a depth of 2578 mbsl at IODP
Site U1385 (this study; yellow diamond) in the Northeast Atlantic, along with
the geographic position of other sites referred to in the text: North Atlantic
DSDP Site 607 (16, 47) (pink triangle); IODP Site U1308 (68) (green circle);

that extends from 0 to 166.5 meters composite depth (mcd), which is equivalent to Marine
Isotope Stage (MIS) 47 at ~1.45 Ma. Sedimentation rates average ~11 cm kyr⁻¹ (23), which
translates into a mean sampling frequency of ~290 years with no significant difference
across the MPT. Today the water depth of 2578 mbsl places Site U1385 in well-ventilated
North Atlantic Deep Water (NADW) that is made up of varying proportions of Classical
Labrador Sea Water (CLSW; ~45%) (see table S1 for [O2]), Iceland-Scotland Overflow Water
(ISOW; ~30%), and Denmark Strait Overflow Water (DSOW; ~5%), with the remainder
(~20%) derived from the modified Antarctic Bottom Water (AABW) (29). Present-day [O2] at
the site ranges between values of ~237 and 245 µmol kg⁻¹ depending on [O2] profiles and
databases considered (Fig. 1, fig. S1, and table S1), and the δ¹³C value of DIC averages ~1‰
(Fig. 1). During glacial stages the relative proportion of northern-sourced deep water
decreased at Site U1385 as the fraction of southern-sourced deep water increased (30-33).
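
The link between sedimentation rate and temporal resolution quoted above can be checked with a short calculation. The sketch below is only an illustration: it assumes a constant sedimentation rate and evenly spaced samples, and the function name is hypothetical. Under those assumptions, the quoted ~11 cm kyr⁻¹ and ~290-year resolution imply a sample spacing of roughly 3 cm.

```python
# Values quoted in the text; treated here as constants (an assumption).
SED_RATE_CM_PER_KYR = 11.0   # average sedimentation rate
SAMPLE_INTERVAL_YR = 290.0   # mean time between samples


def implied_spacing_cm(sed_rate_cm_per_kyr, interval_yr):
    """Depth spacing (cm) implied by a sedimentation rate and a time step.

    Hypothetical helper: assumes constant sedimentation and even sampling.
    """
    return sed_rate_cm_per_kyr * interval_yr / 1000.0


spacing = implied_spacing_cm(SED_RATE_CM_PER_KYR, SAMPLE_INTERVAL_YR)
print(f"Implied sample spacing: {spacing:.2f} cm")
```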

Hoogakker et al. (21) applied the Δδ¹³Ccib-aff method to estimate [O2] for the past 160 kyr
using piston core MD95-2042 (3146 mbsl), which is slightly deeper than nearby Site U1385
but has very similar modern-day oxygen concentrations (~245 µmol kg⁻¹). They concluded
that deep-water [O2] was lower during the last two glacial stages relative to today, with
the lowest values (~100 µmol kg⁻¹ lower) recorded during cold stadial conditions
associated with Heinrich events. With Site U1385, we extend the [O2] record of MD95-2042
beyond the last glacial cycle to span the past 1.5 Ma (Fig. 2, A to E). We use the
calibration of Hoogakker et al. (21) to convert Δδ¹³Ccib-aff to deep-water [O2] for values
<2.1‰, which correspond to [O2] <235 µmol kg⁻¹, in which a strong linear relationship
exists: Δδ¹³Ccib-aff = 0.0064 × [O2] + 0.555 (R² = 0.95). Above 235 µmol kg⁻¹ the
relationship breaks down, so we truncate estimated [O2] values at this concentration
(Fig. 2E). For an alternative calibration, see supplementary materials (fig. S2) (25).
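
The conversion described above is a linear calibration that is inverted and then truncated at 235 µmol kg⁻¹. The following sketch shows one way this might be written. The slope, intercept, threshold, and ceiling are taken from the text; the function and constant names are hypothetical, and the exact handling of values at or above the threshold is an assumption, not the authors' code.

```python
# Calibration constants quoted in the text.
SLOPE = 0.0064        # per mil per (umol/kg)
INTERCEPT = 0.555     # per mil
DELTA_LIMIT = 2.1     # per mil; calibration threshold quoted in the text
O2_CEILING = 235.0    # umol/kg; estimates are truncated at this value


def o2_from_gradient(delta_d13c):
    """Estimate deep-water [O2] (umol/kg) from the cib-aff d13C gradient.

    Inverts delta = SLOPE * [O2] + INTERCEPT. Gradients at or above the
    threshold, and any estimate above the ceiling, return the ceiling
    (assumed truncation behavior).
    """
    if delta_d13c >= DELTA_LIMIT:
        return O2_CEILING
    return min((delta_d13c - INTERCEPT) / SLOPE, O2_CEILING)


for gradient in (1.2, 1.451, 2.3):
    print(gradient, round(o2_from_gradient(gradient), 1))
```

With this slope, a 40 µmol kg⁻¹ change in [O2] corresponds to a change of about 0.26‰ in the gradient (40 × 0.0064).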

We assess the reliability of the Δδ¹³Ccib-aff proxy using a transect of core-top sediment
recovered from the Iberian Margin during Cruise JC089 (22) (fig. S1A). Core-top Δδ¹³Ccib-aff
values demonstrate that the shape generally follows water column [O2], with low values
associated with Mediterranean Outflow Water (MOW) and higher values in NADW below
(fig. S1B; supplementary materials). Because only the MOW has [O2] <235 µmol kg⁻¹, we
could only test the calibration on two core tops under the influence of the MOW. The
difference between CTD-measured [O2] and predicted [O2] by using measured Δδ¹³Ccib-aff was
0 and 17 µmol kg⁻¹ for the deep (~1100 m) and shallow (~600 m) MOW core tops, respectively,
which is within the estimated error of the calibration (±17 µmol kg⁻¹) (21) (see
supplementary materials).

South Atlantic ODP Sites 1267 (14) (orange triangle), and Sites 1090 and
1088 (46) (purple square and red circle, respectively). Meridional
cross-sections showing the hydrographic profiles of (B) [O2] and (C) δ¹³C of DIC
relative to the core locations. Maps were created in Ocean Data View
(Schlitzer, 2019; http://odv.awi.de).

From ~1435 to 960 ka during the early Pleistocene (MIS 47 to 27), deep-water [O2]
was relatively high and only once fell below ~180 µmol kg⁻¹ (~1120 ka) (Fig. 2E and fig.
S3). MIS 26 (~960 ka) marks the first time in the record when glacial [O2] dropped to
~140 µmol kg⁻¹. Lower values were attained over the next three glacials (MIS 24 to 20) and
throughout subsequent terminations before dropping to the lowest [O2] value (~95 µmol kg⁻¹)
during MIS 18 (~750 ka; Fig. 2E). The interval from ~960 to 750 ka (MIS 26 to 18) is marked
by notable, persistent decreases in Δδ¹³Ccib-aff (Fig. 2D), particularly throughout MIS 22
(~890 ka) when deep-water [O2] exhibits a greater number of low-[O2] events compared
with those in the early Pleistocene. Over the MIS 27 to 21 (980 to 840 ka) interval, often
referred to as the “900-ka event” (10) or erroneously as the first 100-kyr cycle (34),
low-[O2] values are supported by other recorders of paleo-[O2]. High U/Ca and U/Mn (figs. S4
and S5B) (35, 36) in foraminifera coatings measured in the same Site U1385 samples as those
used for the Δδ¹³Ccib-aff measurements (supplementary materials) support a drawdown in
glacial deep-water [O2]. Furthermore, qualitative indicators of low organic carbon flux
during periods of low [O2] provide confidence in reconstructed depleted [O2] (26) events
at this time (fig. S5, C to E; supplementary materials).

Fig. 2. North Atlantic IODP Site U1385 deep water records (this study) compared with
Atlantic meridional overturning circulation, ice volume, and pCO2 estimates across the
MPT. Site U1385 records extend the record of piston core MD95-2042 (21) (light gray).
(A) Cibicidoides wuellerstorfi δ¹⁸O (dark gray); MIS numbers denote interglacials (odd) and
glacials (even); black dashed boxes highlight a ~30-kyr hiatus and gap. (B and C) Benthic
foraminiferal δ¹³C: C. wuellerstorfi (yellow) and G. affinis (green). (D) Δδ¹³Ccib-aff
(orange); the dashed line at 2.1‰ (orange) marks the calibration threshold. (E) [O2] (black
line with diamonds), truncated at 235 µmol kg⁻¹, above which the calibration is
unreliable; MIS numbers denote important glacials. (F) DSDP Site 607/V30-97 εNd proxy for
North Atlantic Ocean circulation changes (brown line with orange squares) (47). (G) ODP
Site 1123 ice volume δ¹⁸Oseawater (blue) (62). (H) Atmospheric pCO2 records: Antarctic
800-kyr ice core data (1) (black line); Allan Hills Blue Ice (6, 7) (blue triangles and
squares, respectively); and δ¹¹B-based reconstructed pCO2 [green line (5), solid green
squares (3), early Pleistocene light green line and small circles, and late Pleistocene
light green circles (4)]. Chronology is available in the supplementary materials,
materials and methods.

Between ~620 and 470 ka (MIS 15 to early MIS 12), deep-water [O2] decreases were weaker
and less frequent than those during the preceding period (MIS 26 to 16) and were similar
to those observed during the early Pleistocene (Fig. 2E). MIS 14 was a weak glacial period,
which may explain why oxygen remained relatively high at this time. From ~470 to 10 ka
(late MIS 12 to the early Holocene), the magnitude of deep-water [O2] depletions
intensifies, as supported by the alcohol preservation proxy for oxygenation
[C26OH/(C26OH + C29)] (33) (fig. S6), but [O2] values remain higher than those centered on
900 ka.

The relationship between changes in deep-water [O2] and atmospheric pCO2 arises from
the biological carbon pump through the production and consumption of organic matter
(18, 37-39). Whereas many marine processes are involved in pCO2 variations on
glacial-interglacial time scales [see (40) for a review], changes in the efficiency of the
ocean’s soft-tissue pump directly affect dissolved deep-water oxygen through the
regeneration of respired carbon. Weaker overturning circulation during glacial stages leads
to a more efficient biological pump (i.e., an increase in the ratio of regenerated versus
preformed nutrient content in the interior ocean) (18) by stemming the ventilation “leak”
of CO2 from the ocean to the atmosphere and increasing its oceanic residence time (47). At
the same time, a stronger biological pump (i.e., an increased rate of organic carbon
exported through greater productivity in the surface ocean) (42) will result in greater
organic carbon regeneration in the ocean’s interior, enhancing deep-sea respired carbon
storage with a corresponding decrease in oxygen. Widespread reduction in deep-sea [O2]
(without compensatory change at intermediate water depths) (43) implies increased carbon
storage and decreased atmospheric pCO2 (18, 38, 44). Reconstruction of deep-water
paleo-[O2] is therefore a valuable tool for estimating carbon transfers between the
atmosphere and abyssal ocean, as well as for estimating glacial-interglacial changes in
respired carbon storage and atmospheric pCO2 (21, 26).

We compare estimated changes in [O2] from Site U1385 with other deep Atlantic
paleoceanographic records to determine how widespread the observed oxygen depletions were.
Comparison with the U/Ca record from the Deep Sea Drilling Project (DSDP) Site 607, which
is bathed in a water mass similar to that of Site U1385 (Figs. 1 and 3D), reveals
strong correlation between low-[O2] events at Site U1385 and increases in U/Ca at Site
607 (16), suggesting that intervals of reduced oxygenation were widespread in the deep North
Atlantic. Tightly coupled changes in U/Ca and benthic δ¹³C at Site 607 (fig. S7) suggest a
link to changes in deep-water circulation (16), with both proxies showing a change at MIS 26
(~970 ka).

Decreases in deep-water [O2] values are expected to mirror increases in deep-water
[PO4³⁻] according to stoichiometric ratios involved in organic matter respiration (38, 45).
ODP Site 1267 in the Southeast Atlantic (Fig. 1) provides evidence of an increase of
0.5 µmol kg⁻¹ in [PO4³⁻] at ~960 ka (Fig. 3E) (14), coeval with decreases in [O2] and
decreasing P/Al ratios at Site U1308 in the North Atlantic (figs. S8 and S9B; see
supplementary text, section 5). Both proxy-derived [PO4³⁻] at Site 1267 and [O2] at Site
U1385 show statistically significant changes across the MPT (figs. S3 and S9A; see table S2
for statistics), suggesting a greater nutrient content and respired carbon pool in
the deep Atlantic Ocean after 960 ka.
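
The stoichiometric link between phosphate and oxygen can be illustrated with a rough calculation. The text does not state which O2:P ratio is used, so the two ratios below are assumptions drawn from commonly cited remineralization stoichiometries (the classic Redfield value of about 138 and the Anderson and Sarmiento estimate of about 170); the function name is hypothetical. The sketch only shows the order of magnitude of oxygen consumption implied by a 0.5 µmol kg⁻¹ phosphate increase.

```python
# Assumed -O2:P remineralization ratios (mol O2 consumed per mol P released).
# Neither value is taken from the text.
O2_TO_P_REDFIELD = 138.0
O2_TO_P_ANDERSON_SARMIENTO = 170.0


def o2_drawdown(delta_po4_umol_kg, o2_to_p_ratio):
    """Oxygen consumed (umol/kg) for a given regenerated phosphate increase."""
    return delta_po4_umol_kg * o2_to_p_ratio


delta_po4 = 0.5  # umol/kg; increase at ~960 ka quoted in the text
low = o2_drawdown(delta_po4, O2_TO_P_REDFIELD)
high = o2_drawdown(delta_po4, O2_TO_P_ANDERSON_SARMIENTO)
print(f"Implied O2 drawdown: {low:.0f} to {high:.0f} umol/kg")
```

This treats the whole phosphate increase as regenerated rather than preformed, which is a simplification.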

The change in nutrient content of the deep South Atlantic also corresponds to a major
change in thermohaline circulation (THC) inferred from changes in neodymium isotopes
(14, 46, 47). Neodymium isotope (εNd) records from North Atlantic DSDP Site 607/V30-97
(Figs. 1 and 2F) (47) and ODP Sites 1267 (14), 1088, and 1090 (46) in the Southeast
Atlantic (Figs. 1 and 3, F and G) document significant reduction in NADW contribution
and/or an increased influence of AABW (14) between 950 and 900 ka (14, 46). These
THC changes occurred at times of reduced upwelling and degassing of southern-sourced
deep water under expanded sea ice cover in the Southern Ocean, thereby contributing to
decreasing benthic foraminifer δ¹³C (16).

Changes in deep-water circulation can affect [O2] in several ways: Preformed [O2] may
decrease at sites of deep water formation if equilibration of oxygen with the atmosphere
is reduced under sea ice, leading to undersaturation in preformed oxygen (48, 49). An
increase in deep water residence time results in decreased oxygen concentrations through
organic matter oxidation with or without attendant changes in carbon flux from the
surface (41, 50). Lastly, a change in the proportion of northern- versus southern-derived
deep water masses (16, 51, 52) bathing Site U1385 can affect [O2]; for example, NADW has
[O2] values that are ~50 µmol kg⁻¹ higher than those of AABW and 75 µmol kg⁻¹ higher than
Circumpolar Deep Water (Fig. 1 and table S1) (53).

Fig. 3. Comparison of Atlantic Ocean deep-water oxygenation, nutrient and circulation proxies, and South
Pacific ice volume changes through the MPT. (A) IODP Site U1385 Cibicidoides wuellerstorfi δ¹⁸O (dark gray,
this study); MIS numbers denote interglacials (odd) and glacials (even). (B) ODP Site 1123 ice volume δ¹⁸Oseawater
(blue) (62). (C) [O2] estimated from Δδ¹³Ccib-aff (this study; black diamonds). MIS numbers denote important
glacials. (D) DSDP Site 607 U/Ca redox proxy (purple diamonds) (16). (E) ODP Site 1267 Cd/Ca-derived [PO4³⁻]
(green squares) (14). (F and G) ODP Sites 1088 (mid-depth) and 1090 (deep) εNd, respectively (46); interglacial
(orange circles) and glacial (blue circles) maxima.

Changes in source areas of NADW formation could substantially affect [O2] in the
deep North Atlantic. CLSW is a well-ventilated water mass that distributes oxygen
throughout the entire North Atlantic (54), including Site U1385 on the Iberian Margin that
today is bathed by ~45% CLSW (29). Oxygen saturation of CLSW is highly sensitive to
winter conditions in the Labrador Sea through bubble-mediated air-sea transfer associated
with intensive winds, cooling, and deep convection (55). The Labrador Sea has been
described as a “trapdoor” through which the flux of oxygen ventilates much of the deep
Atlantic basin (55). During Heinrich stadials of the last glaciation, the Labrador Sea was
covered by extensive sea ice, which would have reduced the ventilation of the deep Atlantic
by CLSW (21, 56). Sea ice also expanded over the Nordic Seas (57-59) (source areas of
DSOW and ISOW) (53) and the Irminger Sea where these water masses mix to form NADW
(53, 60, 61).

The decrease in glacial deep-water [O2] at ~960 ka coincides with an increase in
continental ice volume and lowered sea level, inferred from changes in δ¹⁸O of seawater at
Site 1123 in the Southwest Pacific (57, 62, 63) (Figs. 2G, 3B, and 4A). Available evidence
suggests Northern Hemisphere glaciation intensified during MIS 22 [(57) and references
therein], which resulted in changes to Atlantic deep-water circulation (52).

The timing of the initial decrease in deep-water [O2] at ~960 ka during MIS 26 is
associated with a short-lived excursion to very negative εNd values at Site 607 (Fig. 2F)
(47). This event is interpreted as a result of extensive weathering and erosion of the
North American craton between MIS 27 and 25 (~980 to 950 ka) and the possible transition
from terrestrial- to marine-terminating ice sheets in the Northern Hemisphere (64). MIS 26
also represents the first time the polar front shifted to a zonal position south of Site
980/981 in the Northeast Atlantic (55.5°N, 14.7°W) as inferred from an increase in the
percent of the polar planktonic foraminifera Neogloboquadrina pachyderma (percent NPS), a
proxy for cold surface temperatures (fig. S10) (65, 66). These precursor changes heralded
the major changes in ice volume, thermohaline circulation, and pCO2 associated with the
“900-ka event” during MIS 24 to 22 in less highly resolved records than those of Sites
U1385 and 980/981 (3, 4, 6, 10, 14, 46, 47, 62, 65).

The deep-water oxygen depletions may have been driven by increased freshwater inputs
and ice rafting to the source areas of NADW formation. The lowest [O2] values during the
last glacial cycle are associated with cold millennial events in the North Atlantic
(fig. S11B) (21). Millennial [O2] depletion events occur throughout the entire record and
are most common toward the latter part of glacial cycles and during terminations. These
times are also associated with enhanced sea ice extent in the North Atlantic (67), cold
stadial periods, and increases in ice-rafted detritus (IRD) at both the North Atlantic Site
U1308 (68) (Fig. 1 and figs. S11A and S12) and the Labrador Sea Site U1302/03 (fig. S13)
(69). A close connection exists today between North Atlantic deep-water oxygenation and
winter surface conditions in the Labrador, Irminger, and Nordic Seas (55-58). We suggest
that enhanced surface stratification and reduced deep-water convection in NADW source
areas may have caused episodic reduction of North Atlantic deep-water ventilation, leading
to reduced deep-water oxygen and increased carbon storage (70). Incursions of glacial
southern-sourced deep water into the North Atlantic associated with weaker NADW production
(51, 52, 61, 71, 72) could also partly account for [O2] changes observed at Site U1385 over
the past 1.5 Ma given the [O2] difference (~50 µmol kg⁻¹) between northern- and
southern-sourced deep water today (Fig. 1 and table S1).
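
The statement that water-mass mixing could only partly account for the observed [O2] changes can be illustrated with a two end-member sketch. It uses only the ~50 µmol kg⁻¹ present-day contrast quoted above and assumes conservative mixing with fixed end-member values and no change in respiration; the function names are hypothetical and this is not the authors' calculation.

```python
# Present-day [O2] contrast between northern- and southern-sourced deep
# water quoted in the text (umol/kg). Holding it fixed is an assumption.
O2_CONTRAST = 50.0


def o2_change_from_mixing(delta_southern_fraction, contrast=O2_CONTRAST):
    """[O2] change (umol/kg) when the southern-sourced fraction shifts.

    Conservative two end-member mixing only; respiration is ignored.
    """
    return -delta_southern_fraction * contrast


def fraction_shift_needed(o2_drop, contrast=O2_CONTRAST):
    """Shift in southern-sourced fraction needed to explain an [O2] drop."""
    return o2_drop / contrast


print(o2_change_from_mixing(0.2))    # effect of a 20-point shift
print(fraction_shift_needed(40.0))   # shift needed for a 40 umol/kg drop
```

Under these assumptions, a 40 µmol kg⁻¹ reduction would require a shift of about 0.8 in the southern-sourced fraction, which is why mixing alone is unlikely to be the full explanation.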

The changes in deep-water [O2] at Site U1385 
could equally be driven by surface processes 
in the Southern Ocean such as changes in 
productivity (13), surface stratification (15, 73), 
vertical mixing, and sea ice extent. These 
changes would be transmitted to the deep 
sea through an expanding southern-sourced 
deep water mass such as Lower Circumpolar 
Deep Water (74) in which oxygen was con- 
siderably reduced during the last glacial 
period (26, 36, 44). Reduction in ventilation 
due to circulation changes, Southern Ocean 
stratification, and sea ice expansion would 
have contributed to the inferred increase
in deep-sea carbon storage over the MPT
(14-16, 18, 73, 75-78).

Decreased glacial deep-water [O2] and in-
creased deep-sea carbon storage across the
MPT have implications for atmospheric pCO2.
The Δδ13C proxy of [O2] at Site U1385 is
consistent with changes in pCO2 measured
directly in ice cores and blue ice in Antarctica 
(1, 6, 7) and inferred indirectly from boron
isotope data (3-5) (Fig. 4B). Prior to ~900 ka, 
minimum glacial pCO2 values were ~24 ppm
higher than they were afterward (3, 5-7). If 
applied to the whole Atlantic Ocean influenced 
by an expanding southern-sourced deep water 
mass, Hoogakker et al. (21) estimated at least
15% of the reduction in atmospheric pCO2
during the last glacial maximum (LGM) could 
be accounted for by the increases observed in 
their respired carbon pool. Furthermore, over 
the same period, the Pacific Ocean is reported 
to have had old, oxygen-depleted deep-water 
(26, 43, 44). The reconstructed glacial [O2] 
values reported here for the MPT are at least 
20 µmol kg⁻¹ lower than the LGM (20), imply-
ing greater carbon storage than during the
LGM. It remains to be seen from other basins 
whether this was a global phenomenon. 

In summary, inferred changes in North 
Atlantic deep-water [O2] for the past 1.5 Ma
reveal a significant (>40 µmol kg⁻¹) reduc-
tion in glacial deep-water [O2] at ~900 ka, sug-
gesting increased storage of respired carbon,
which is consistent with a drawdown of glacial
atmospheric pCO2 values (Fig. 4B) (1, 3, 4, 6, 7).
The inferred change in [O2] is supported by
trace metal (16) and nutrient (14) proxy
records in other Atlantic sites associated with
a critical change in glacial THC at ~900 ka (46)
(Fig. 3, C to G). The close association between
[O2] depletions and IRD events (68, 69, 79)
suggests that increased stratification and sea 
ice cover in NADW source regions (56-59) re- 
duced the oxygen supply to much of the deep 
North Atlantic (55). In addition, northward 
expansion of southern-sourced deep water 
into the North Atlantic and processes in the 
Southern Ocean (e.g., productivity, surface 
water stratification, vertical mixing, and sea 
ice extent) also contributed to reduced venti- 
lation associated with a major change in deep- 
water circulation (46). Our results support a 
set of internally consistent changes in Atlantic 
deep water beginning at ~960 ka across the 
MPT, which included a decrease in oxygen 
concentrations, increased nutrient concen- 
trations (14), and storage of respired carbon 
that led to a reduction in glacial pCO2 (3-7, 47)
and an associated increase in global ice vol-
ume (62).

A conserved Bacteroidetes antigen induces
anti-inflammatory intestinal T lymphocytes

The microbiome contributes to the development and maturation of the immune system. In response
to commensal bacteria, intestinal CD4+ T lymphocytes differentiate into functional subtypes with
regulatory or effector functions. The development of small intestine intraepithelial lymphocytes
that coexpress CD4 and CD8αα homodimers (CD4IELs) depends on the microbiota. However, the
identity of the microbial antigens recognized by CD4+ T cells that can differentiate into CD4IELs
remains unknown. We identified β-hexosaminidase, a conserved enzyme across commensals of
the Bacteroidetes phylum, as a driver of CD4IEL differentiation. In a mouse model of colitis,
β-hexosaminidase-specific lymphocytes protected against intestinal inflammation. Thus, T cells of a
single specificity can recognize a variety of abundant commensals and elicit a regulatory immune
response at the intestinal mucosa.


The microbiota contributes to functional
specification of adaptive immunity, both
through direct interactions and through
soluble mediators released into the envi-
ronment (1-3). Colonic bacteria such as
Helicobacter hepaticus promote differentiation
of antigen-specific CD4+ T cells into Foxp3+ reg-
ulatory T cells (Treg cells) in the colon, whereas
segmented filamentous bacteria (SFB) induce
quasi-clonal pro-inflammatory T helper 17
(TH17) cells in the ileum (4-10). Such inter-
actions are not only specific for the bacterial
species concerned but, depending on location 
and context, can also influence T cell fates 
(4, 11). Because fate decisions are made at the 
clonal level, different T cell receptor (TCR) 
specificities ought to drive distinct develop-
mental and functional outcomes.

Two subsets of CD4+ T lymphocytes are
known to regulate adaptive immunity at the
intestinal mucosa: peripherally induced regu-
latory T cells (pTreg cells) and CD8αα-expressing
intraepithelial lymphocytes (CD4IELs) (12-16).
Depletion of CD4IELs aggravates intestinal
inflammation in mice that lack expression of
Foxp3 (14). CD4IELs and pTreg cells thus co-
operate in the regulation of local intestinal
inflammation. In specific pathogen-free (SPF) 
mice, CD4IELs are present mostly in the small 
intestinal epithelium (SIE). Their abundance 
varies with age, diet, and microbiota (fig. S1, A 
and B, and data S1, A and B) (17, 18). CD4IELs
can develop in a microbiota-dependent man-
ner, either from conventional CD4+ T cells or
from Treg cell precursors (11, 14). Germ-free
(GF) mice have few if any CD4IELs (fig. S1C)
(14, 19). Mice treated with antibiotics show a
similar decline in CD4IELs (fig. S1, D and E)
(14, 19), whereas fecal transplants from SPF
mice restore the CD4IEL population in GF
mice to varying degrees (fig. S1F) (17, 18). The
microbiota therefore contributes not only to 
the development but also to the maintenance 
of CD4IELs. However, the microbial antigens 
recognized by CD4IELs are unknown. Most 
IEL populations, including CD4IELs, display a 
restricted TCR repertoire (17, 20, 27). A limited 
array of antigens might thus suffice to shape 
this compartment. We used a microbiota- 
specific transnuclear (TN) monoclonal T cell 
model (7) to identify naturally occurring TCR 
ligands of CD4IELs in SPF mice (fig. S2A). 
T cells from the TN mouse carry a monoclonal 
TCR, cloned from a pTreg cell obtained from a
mesenteric lymph node (mLN) of an SPF mouse
(11). Naive monoclonal T cells isolated from
TN mice populate not only the recipient’s mLNs,
where they differentiate into pTreg cells, but
also the small intestinal epithelium, where they
develop into CD4IELs in a microbiota-dependent
fashion (11).

We thus sought to identify the commensal 
member(s) recognized by TN T cells by se- 
lectively growing fecal bacteria derived from 
Taconic “excluded flora” (EF) mice, which 
carry a microbiota enriched for the antigen 
recognized by the TN TCR (11). TN T cells
proliferated strongly in the presence of bac- 
terial extracts derived from Bacteroides bile 
esculin (BBE) agar plates or aerotolerant 
bacteria recovered from Schaedler blood 
agar (SBA) plates. 16S ribosomal RNA (rRNA) 
sequencing showed enrichment for opera- 
tional taxonomic units (OTUs) correspond- 
ing to the Parabacteroides genus in activating 
extracts (Fig. 1, A to C, and data S1C). TN 
T cells proliferated robustly in the presence 
of Parabacteroides goldsteinii extracts (Fig. 1D), 
the predominant member of the altered 
Schaedler flora (ASF) (22). This proliferation 
was dependent on antigen presentation by 
dendritic cells (fig. S2B). Neither the related 
Parabacteroides distasonis nor any other spe- 
cies tested induced proliferation of TN T cells 
in vitro (Fig. 1D and fig. S2C). Next, we trans-
ferred CD4+ TN T cells into congenic recipients,
which were then immunized with bacterial
extracts. Proliferation of TN cells occurred
in the draining lymph nodes in response to
P. goldsteinii extract but not P. distasonis
extract or phosphate-buffered saline (Fig. 1E).
Thus, P. goldsteinii, a commensal abundant
in Taconic mice but less so in Jackson (Jax)
mice (fig. S3, A to C, and data S1, D and E),
can engage the TN TCR in vivo. TN T cells
failed to proliferate and differentiate into
CD4IELs in Rag1−/− recipients treated with
antibiotics or when colonized with unrelated
bacteria (Clostridium tertium) but expanded
and differentiated into CD4IELs in mice col-
onized with P. goldsteinii (Fig. 1F and fig. S3D).
Thus, P. goldsteinii promotes the in vivo de- 
velopment of CD4IELs from naive T cells 
carrying the TN TCR. 

We next tested whether P. goldsteinii could 
induce CD4IELs in wild-type (WT) mice with 
a polyclonal T cell repertoire. SPF mice obtained 
from Jax, which naturally harbor few CD4IELs 
(fig. S1A), developed more CD4IELs upon colo-
nization with P. goldsteinii (Fig. 1G). By contrast,
colonization with the unrelated commensal
SFB, also absent in Jax mice (10), did not boost
CD4IEL frequencies in these mice (fig. S3, E
and F). Monocolonization of GF animals 
with P. goldsteinii was not sufficient to induce 
CD4IELs, however (fig. S4A). One possible 
explanation is that monocolonized mice have 
low tissue expression of interferon-γ (IFNγ)
and lack class II major histocompatibility 
complex (MHCII) products on intestinal 
epithelial cells (fig. S4, B to D), both of which 
are required for CD4IEL development (20, 23).
Thus, P. goldsteinii promotes the accumula-
tion of CD4IELs from both TN and WT pre-
cursors in SPF mice.

Fig. 1. P. goldsteinii induces CD4IELs in both monoclonal and polyclonal
SPF mice. (A) Schematic of the in vitro proliferation assay. (B) 16S rRNA
sequencing of isolated bacterial extracts used in (C). (C) Carboxyfluorescein
diacetate succinimidyl ester (CFSE) dilution among TN cells in response to the
indicated bacterial extracts derived from Taconic EF fecal bacteria, isolated by
using the indicated growth conditions. (D) CellTrace Violet dilution among CD4+
TN T cells in response to the indicated bacteria. (C) and (D) are representative
of flow cytometry dot plots of three experiments. (E) Representative flow
cytometry dot plots showing CellTrace Violet dilution (left) and frequency of
divided cells (right) among CD4+ TN T cells harvested from the draining inguinal
LNs of mice immunized with the indicated bacterial extracts (n = 2 to 6 mice
per group). (F) Rag1−/− hosts were treated with antibiotics (ABX) then either
colonized or not colonized with the indicated bacteria before receiving CD4+
TN T cells. Cells from the SIE were analyzed by means of flow cytometry (n = 2 to
5 mice per group). (Top) Experimental design. (Middle) Representative flow
cytometry dot plots of Vα2 and Vβ6 expression among CD45+ cells and CD4 and
CD8α expression among TN cells (Vα2+Vβ6+) in SIE (bottom). (Right) Frequency
of (top) TN cells among CD45+ cells and (bottom) CD4IELs (CD4+CD8αα+) in the
SIE among TN CD4+ T cells in all mice analyzed in (F). (G) WT Jax mice were treated
with ABX and colonized with P. goldsteinii. Cells from the SIE were analyzed by
means of flow cytometry (n = 6 to 10 mice per group). (Top) Experimental design.
(Bottom) Representative dot plots showing (left) Foxp3 and CD8α expression
and (right) frequency of CD4IELs among CD4+ T cells. The graphs in (F) and (G)
show the means ± SEM, and each symbol indicates a mouse from two to three
experiments. P values were calculated with unpaired two-tailed Student's t test in
(E) and (F) and two-tailed Student's t test with Welch correction in (G). Tac, Taconic;
CR, Charles River; SBA, Schaedler blood agar; BBE, Bacteroides bile esculin; CNA,
colistin-nalidixic acid.

We next fractionated P. goldsteinii lysates to
identify potential TCR ligands derived from
this species (fig. S5, A to D). This procedure
yielded fractions that activated TN T cells
in vitro (fig. S5E). Liquid chromatography
with tandem mass spectrometry (LC-MS/MS)
analysis of fraction 17 identified 33 proteins,
the top 15 of which were expressed recombi-
nantly in Escherichia coli (table S1). Their
sonicates served as a source of candidate 
antigens tested in the proliferation assay. We 
identified β-N-acetylhexosaminidase (β-hex)
as the protein recognized by the TN TCR 
(Fig. 2A). TN T cells proliferated in mice 
immunized with protein extract of E. coli
expressing P. goldsteinii β-hex but not when
immunized with β-hex derived from closely
related species (Fig. 2B). By analyzing the
activity of truncated versions of the protein,
we defined a ~70-residue stretch that contained
the cognate epitope of the TN TCR. Using over-
lapping peptides, we identified YKGSRVWLN
as the minimal epitope (fig. S6, A to D). The cor-
responding β-hex peptide from Parabacteroides
merdae failed to stimulate TN T cells, even
though it bound to I-Ab (Fig. 2C; fig. S6, E to
G; and table S2). Thus, the P. goldsteinii
YKGSRVWLN β-hex epitope is a natural lig-
and of the TN TCR. Homologous β-hex pep-
tides derived from Bacteroides vulgatus and
Bacteroides luti also induced proliferation of
TN cells in vitro, but only at higher concen-
trations (Fig. 2D). Using a combination of
BlastP and Jackhmmer analyses (24), we found
putative β-hex immunostimulatory sequences
in many Bacteroidetes members, the most
abundant phylum found exclusively in the
gastrointestinal tract (25, 26), as well as in more
distantly related species, such as Spirosoma
panaciterrae (fig. S7). Thus, the TN TCR likely
recognizes an even broader collection of bac-
terial species that belong to the Bacteroidetes
phylum.

Fig. 2. The TN TCR recognizes epitopes from Bacteroidetes β-N-
acetylhexosaminidase in complex with I-Ab. (A) Candidate polypeptides
identified with mass spectrometry were recombinantly expressed in E. coli, and
bacterial extracts were tested in vitro for their ability to activate TN T cells.
(Left) Representative dot plots. (Right) Frequency of divided TN T cells in the
presence of tested extracts. Graph shows means of two technical replicates.
(B and C) In vivo proliferation of CD4+ TN T cells in response to 25 µg of
bacterial extracts [(B), n = 3 to 7 mice per group] or 2 µg of peptide [(C),
n = 5 mice per group]. Mutations in the core epitope are indicated in red. (D) (Left)
Same as (A), but by using the indicated peptides at concentrations of
500 nM to 50 pM in serial 10-fold dilutions. (Right) Alignment of sequences
homologous to the TN epitope. “Activity” is the ability of each peptide to
induce proliferation of TN T cells in vitro (left). (E and F) CD45.1+ recipients
were treated with ABX and received CD45.2+CD4+ TN T cells. The mice were then
either colonized or not colonized with the indicated bacteria (n = 7 or 8 mice
per group). Four weeks after colonization, (E) small intestine intraepithelial
lymphocytes and (F) mLNs were analyzed. Dot plots show CD8α and
Foxp3 expression, and graphs show the frequency of CD4IELs (CD4+CD8αα+)
in the (E) SIE and (F) mLN Treg cells (Foxp3+) among TN CD4+ T cells. In vitro
experiments are representative of three experiments [(A) and (D)]. Dot plots
for in vivo experiments show one representative mouse, and the graphs show
all mice analyzed in two experiments. Each symbol represents a mouse,
and graphs show means ± SEM [(B), (C), (E), and (F)]. P values were
calculated with one-way analysis of variance (ANOVA) with Dunnett's post hoc
test in (B), (E), and (F) or unpaired two-tailed Student's t test in (C).
P.g., P. goldsteinii; P.d., P. distasonis; P.m., P. merdae.

Colonization of SPF mice with P. goldsteinii
promoted the expansion and development
of CD4IELs and pTreg cells from TN CD4+
T cells. Likewise, B. vulgatus, which also en-
codes a TN β-hex epitope (Fig. 2D), supported
the expansion and development of TN CD4IELs
(Fig. 2, E and F, and fig. S8, A to C). By contrast,
colonization of mice with a β-hex-deficient
strain of B. vulgatus failed to do so (fig. S8,
D to H). Despite similar expression of β-hex
along the intestinal tract (fig. S9, A and B),
TN cells transferred into SPF mice proliferated
predominantly in the jejunum/ileum-draining
mLN (fig. S9, C to E). Thus, the β-hex epitope
is presented in the distal mLN and is nec-
essary for the differentiation of CD4IELs from
TN precursors. 
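
The homology search behind the epitope comparison above (BlastP and Jackhmmer in the study) can be pictured with a much simpler toy scan. The sketch below is only an illustration under stated assumptions: it slides a nine-residue window along each sequence and counts mismatches against YKGSRVWLN, which is far cruder than profile-based searching, and the three sequences are invented placeholders rather than real β-hex proteins. The function names are likewise hypothetical.

```python
EPITOPE = "YKGSRVWLN"  # minimal epitope reported in the text


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length peptides differ."""
    if len(a) != len(b):
        raise ValueError("peptides must have equal length")
    return sum(x != y for x, y in zip(a, b))


def best_window(protein: str, epitope: str = EPITOPE):
    """Return (mismatch count, start index, window) for the closest window.

    Returns None when the protein is shorter than the epitope.
    """
    k = len(epitope)
    hits = [
        (mismatches(protein[i:i + k], epitope), i, protein[i:i + k])
        for i in range(len(protein) - k + 1)
    ]
    return min(hits) if hits else None


# Hypothetical sequences for illustration only (not real beta-hex proteins).
proteins = {
    "toy_exact": "MSTAYKGSRVWLNDE",
    "toy_one_off": "MSTAYKGSRIWLNDE",
    "toy_unrelated": "MAAAAAAAAAAAAAA",
}

for name, seq in proteins.items():
    print(name, best_window(seq))
```

A window with zero mismatches corresponds to an exact copy of the epitope. Windows with one or two mismatches are the kind of candidate that would still need functional testing, since the text shows that a homologous peptide can bind MHCII yet fail to stimulate TN T cells.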

Fig. 3. The β-hexosaminidase epitope is recognized by intraepithelial
lymphocytes and Treg cells. (A) Cells from spleen, mLNs, and the SIE were
harvested from SPF mice housed at Boston Children’s Hospital. Cells from
CD45.1+ WT mice and CD45.2+ TN mice were mixed at 10:1 (WT:TN) ratio and
stained with β-hex tetramers. Gates show the frequency of tetramer+ CD45.1+ or
CD45.2+ cells among CD4+ T cells. (B) Representative dot plots of mLNs from
Tac SPF mice (n = 13) stained with the indicated tetramers (left). Frequencies of
tetramer+ cells in the indicated populations among CD4+ T cells isolated from
mLNs (right). (C) Tetramer analysis among SIE CD4+ T cells of WT mice sourced
from the indicated facilities, cohoused or treated orally with β-hex peptide as
indicated (n = 5 to 8 mice per group). (D) Quantification of P. goldsteinii
abundance by means of quantitative polymerase chain reaction in fecal samples
derived from the mice shown in (C). (E to G) scRNA-seq analysis of β-hex-
specific (β-hex+) and β-hex− CD4+ T cells sorted from SIE of three Taconic
SPF mice. (E) UMAP plots according to gene expression analysis showing the
distribution of conventional T cells (Tconv, CD103−CD8α−), PreIEL (CD103+CD8α−),
and CD4IELs (CD103+CD8α+) according to the index sorting. (F) Volcano plot
comparing the gene expression of β-hex+ and β-hex− cells. (G) Expression
of selected genes by β-hex+ and β-hex− cells. All mice shown of [(E) to (G)]
two, (A) three, (B) five, or [(C) and (D)] six experiments. Error bars indicate
means ± SEM. P values were calculated with paired two-tailed Student's
t tests in (B), one-way ANOVA with Dunnett’s post-hoc test in (C) and (D),
and Wilcoxon rank sum test in (F) and (G). Jax, Jackson; Tac, Taconic; co-h,
co-housed.

To determine whether this prevalent β-hex
epitope is also recognized by intestinal CD4+
lymphocytes of WT SPF mice, we designed
P. goldsteinii β-hex-MHCII tetramers spanning
the β-hex epitope (table S3) and enumerated
β-hex-specific CD4+ T cells by means of flow
cytometry (Fig. 3, A to C, and fig. S10, A to L).
Splenic CD4+ T cells showed equally low
staining with control and β-hex tetramers,
whereas β-hex-specific cells were readily
identified in both gut-draining mLNs and small
intestinal epithelium (Fig. 3, A to C, and fig.
S10, A to F). The frequency of β-hex-specific
cells was significantly higher among mLN Treg
cells of mice housed at both Boston Children’s
Hospital and Rockefeller University animal
facilities (Fig. 3B and fig. S10G). In mLNs,
β-hex-specific cells accounted for up to 4%
of all CD25+CD4+ T cells, a population highly
enriched for Treg cells (Fig. 3B) (27). Despite
the intrinsically reduced ability of CD4IELs
to bind MHCII tetramers (fig. S10H), 40% of
β-hex tetramer+ cells in the epithelium were
CD4IELs (fig. S10, I and J). The frequencies
of β-hex-specific CD4+ T cells varied by mouse,
with age, and even between rooms within the
same facility, ranging from 1 to 20% in the
epithelium and 0.2 to 2% in the mLN (fig. S10,
C to F, I, and K). We then examined the extent
to which the commensal-derived β-hex antigen
shapes the epithelial lymphocyte compartment
(Fig. 3C and fig. S10L). In Jax mice, which
carry low frequencies of CD4IELs (fig. S1A)
and are virtually devoid of P. goldsteinii (fig.
S3, A and B), the proportion of β-hex-specific
IELs was lower than in Taconic mice (Fig. 3,
C and D). Taconic mice had high frequencies
of CD4IELs and were natural carriers of in-
testinal bacteria encoding the β-hex antigen,
including P. goldsteinii and other Bacteroidetes
(fig. S1E, and fig. S3, A to C). Similarly, Jax
mice cohoused with Taconic mice were effi-
ciently colonized with P. goldsteinii and showed
a higher proportion of β-hex-specific IELs, close
to that observed in the Taconic mice themselves
(Fig. 3, C and D). Treatment of Jax mice with
β-hex peptide slightly increased the frequencies
of β-hex-specific IELs, although this difference
did not reach statistical significance (Fig. 3C).
Thus, the P. goldsteinii β-hex epitope is a
prominent natural T cell receptor ligand
for mLN Treg cells and intraepithelial CD4+
T cells. We conclude that a microbiota rich
in P. goldsteinii supports the expansion of
commensal-specific IELs, using as an example the
β-hex antigen derived from abundant intesti-
nal commensals such as Parabacteroides and
Bacteroides spp.

Fig. 4. β-hexosaminidase-specific T cells confer protection against intesti-
nal inflammation. (A to D) Taconic SPF Rag2−/− mice either received or did not
receive CD45.2+CD4+ TN T cells (day 0) and naive CD45.1+CD4+ T cells from
WT mice (day 12). On days 2, 4, and 13, some mice received β-hex peptide orally
(n = 8 to 10 mice). (A) Percent of initial weight over time. (B) Fecal lipocalin
(LCN-2) levels throughout the experiment (n = 5 to 8 mice). (C) Representative
photomicrographs of hematoxylin and eosin (H&E)-stained colonic sections
from indicated groups. (D) Histology score of H&E-stained sections of colon.
(E to H) Same as (A) to (D) but with either WT (n = 7 mice) or Foxp3DTR
(n = 10 mice) CD4+ T cells (on day 12 after transfer of TN T cells). These mice
received DT intraperitoneally once per week throughout the experiment. (E)
Percent of initial weight over time. (F) Fecal LCN-2 levels throughout the
experiment. (G) Representative photomicrographs of H&E-stained colonic
sections from indicated groups. (H) Histology score of H&E-stained sections
of colon. (I to L) Same as (A) to (D) but by using either WT (n = 5 mice) or
Foxp3−/− TN CD4+ cells (n = 5 mice, day 0). (I) Percent of initial weight over time.
(J) Fecal LCN-2 levels. (K) Representative photomicrographs of H&E-stained
sections of colon of indicated groups. (L) Histology scores of H&E-stained
sections of colon. (A) to (L) show all mice analyzed in two experiments. Graphs
show means ± SEM. P values were calculated with two-way ANOVA for
repeated measures with the Tukey post hoc test in (A), (B), (E), (F), (I), and
(J); Mann-Whitney test in (H); and Kruskal-Wallis test followed by Dunn's multiple
comparisons post-hoc test in (D) and (L). Scale bar, (C) and (G) (top) 200 µm
and (bottom) 60 µm; (K) (top) 200 µm and (bottom) 100 µm. Low- and high-
magnification photomicrographs show two representative mice in each group
in (C), (G), and (K). ns, not significant.

To define the transcriptional signature of
β-hex-specific IELs, we index-sorted β-hex+
and β-hex− IELs from Taconic mice (fig. S11A)
and profiled them by means of single-cell
RNA-sequencing (scRNA-seq) using the SMART-
Seq2 methodology (28). This approach enabled
us to analyze the gene expression profile of
commensal-induced β-hex-specific IELs at
homeostasis. We identified two major clusters
visualized with uniform manifold approxima-
tion and projection (UMAP) (Fig. 3E and
fig. S11B). Cluster 0 contained mostly Cd8a
and Itgae (encoding CD103)-expressing cells,
and a gene signature typical of CD4IELs,
which includes both cytotoxic and regulatory
profiles as defined by expression of granzymes
and Ctla4 (20, 29). Cluster 1 was composed
predominantly of Cd8a− IELs, including pre-
IELs (fig. S11C) (20, 29). β-hex+ and β-hex−
IELs were largely indistinguishable from each 
other and were evenly distributed among the 
two clusters (Fig. 3E and fig. S11, D to F). There 
were no differentially expressed genes between 
β-hex+ and β-hex− IELs, and both subsets ex-
pressed typical IEL markers (Fig. 3, F and G,
and fig. S11, E to G), associated with both
regulatory (Ctla4) and cytotoxic (Gzmb) pro-
files. We observed a slight enrichment of β-hex+
IELs in cluster 1 compared with cluster 0 (Fig. 3,
F and G, and fig. S11D). This enrichment may
have been related to the reduced ability of
CD4IELs to bind MHCII-tetramers (fig. S10H),
presumably because of the inhibitory interac-
tions between the thymus leukemia (TL) anti-
gen on epithelial cells and CD8αα on CD4IELs
(30). Thus, the gene expression profile of β-hex-
specific IELs is representative of all IELs. Con-
sequently, β-hex likely represents one of many
possible specificities that lead CD4+ T cells to
acquire an IEL phenotype after initial priming
by a commensal antigen.

Although CD4IELs express IFNγ and gran-
zyme B, they display a regulatory phenotype,
characterized by the expression of interleukin-10
and Lag3 (Fig. 3G) (1, 20, 29), and can protect
against intestinal inflammation (14-16). To
test the anti-inflammatory potential of β-hex-
specific T cells, we transferred CD4+ TN T cells
into immunodeficient Rag2−/− mice and then
induced colitis through WT CD4+ T cells (fig.
S12A) (32). WT cells expanded in the intestine,
irrespective of whether TN cells were present
or not (fig. S12B). TN cells efficiently differentiated
into CD4IELs (fig. S12C) but poorly differentiated
into pTreg cells (fig. S12D), similar to what has
been observed for other commensal-specific
T cells transferred into Rag1−/− recipients
(32). TN cells are found predominantly in the
small intestine and mLNs (11), but in this
colitis model, TN cells were also present in
the large intestine, where they acquired ex-
pression of CD8αα (fig. S12, B and C). Mice
that received TN cells were partially protected
from colitis induced by colitogenic WT cells.
We observed reduced weight loss, improved
histopathological scores, and lower levels of
fecal lipocalin-2 (LCN-2), a marker of intesti-
nal inflammation (Fig. 4, A to D) (33). Oral
administration of β-hex peptide, a procedure
that expands TN cells and leads to their dif-
ferentiation into CD4IELs (fig. S12E), further
reduced intestinal inflammation in recipients
of TN and WT CD4+ T cells (Fig. 4, B to D).
Similarly, mice that received only colitogenic
WT cells and oral β-hex peptide also exhibited
decreased intestinal inflammation and increased
frequency of total CD4IELs in the small intestine
(Fig. 4B and fig. S12C). At an earlier time point,
we also observed an increased frequency of total
CD4IELs as well as of β-hex-specific IELs, which
was accompanied by a decreased frequency of
Treg cells in the epithelium (fig. S12, F to H). The
frequency of WT-derived Treg cells was low in all
groups, suggesting little contribution by pTreg
cells to such protection (fig. S12D), as has been
observed previously (34). To definitively ex-
clude a role for WT Treg cells, we induced
colitis by transferring T cells from Foxp3DTR
mice and depleted Treg cells with diphtheria
toxin (DT) (Fig. 4, E to H, and fig. S12, I to J).
Depletion of Treg cells in this colitis model
leads to severe weight loss, intestinal inflam-
mation, and death (35). Despite this deple-
tion, TN T cells still protected mice against
intestinal inflammation. Although mice lost
weight because of toxicity associated with re-
peated injections of DT (36, 37), there was no
difference between the two groups (recipients
of Foxp3DTR versus Foxp3WT naive T cells)
(Fig. 4, F to H). Similar to WT Treg cells, the
proportion of TN T cells that differentiated
into Treg cells was low but not null in this
model (fig. S12D). To evaluate a possible
contribution of TN-derived Treg cells to the
anti-inflammatory response induced by β-hex-
specific TN T cells, we ablated Foxp3 directly
in pTreg cell TN/Rag1−/−/Foxp3°*"® zygotes
by using CRISPR/Cas9 genome editing. We
generated mice with either a 1-base pair (bp) 
deletion or 1-bp insertion frameshifts in exon 8 
of the Foxp3 gene (fig. S13, A to C). We adop-
tively transferred splenic CD4+ T cells from
these Foxp3-deficient TN mice into Rag2−/−
recipients and then induced colitis by using 
WT naive CD4* T cells as outlined above (fig. 
S13D). Foxp3-deficient TN T cells (fig. S13E) 
conferred partial protection against intestinal 
inflammation (Fig. 4, I to L). Thus, the anti- 
inflammatory effect exerted by TN T cells in 
this model can occur in the absence of Foxp3 
expression. Foxp3-deficient and -sufficient TN 
T cells were equally able to expand in all 
intestinal sites surveyed (fig. S13F). In this 
colitis model, TN T cells differentiated into 
CD4IELs (up to 80%) and secreted IFNγ and
granzyme B, which are hallmarks of IELs,
irrespective of the presence of a functional
Foxp3 gene in the TN donors (fig. S13, G to
I). Thus, β-hex-specific T cells that migrate
to, expand within, and differentiate into 
CD4IELs in response to commensal antigens 
can protect against intestinal inflammation. 
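
To see why a single-base deletion or insertion, as used above to disrupt Foxp3, scrambles everything downstream of the edit, a minimal sketch helps. The 18-nucleotide sequence below is a made-up placeholder, not Foxp3 exon 8, and the helper names are hypothetical; the code only regroups bases into codons and does not model the actual edits or any translation.

```python
def codons(seq: str, frame: int = 0) -> list[str]:
    """Split a DNA string into complete codons, dropping any trailing bases."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def delete_base(seq: str, pos: int) -> str:
    """Return seq with the base at index pos removed (1-bp deletion)."""
    return seq[:pos] + seq[pos + 1:]


def insert_base(seq: str, pos: int, base: str) -> str:
    """Return seq with one base inserted before index pos (1-bp insertion)."""
    return seq[:pos] + base + seq[pos:]


# Hypothetical coding fragment for illustration only (not a Foxp3 sequence).
reference = "ATGGCTAAGGTTCTGGAA"

print("reference:", codons(reference))
print("1-bp del: ", codons(delete_base(reference, 4)))
print("1-bp ins: ", codons(insert_base(reference, 4, "A")))
```

In this toy case every codon after the edited position changes, and the insertion happens to create an in-frame TAA stop codon, which is the general reason a 1-bp frameshift is expected to yield a nonfunctional product.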

Commensal bacteria shape the differentia-
tion and function of intestinal T cells (1), in-
cluding CD4IELs. We show that P. goldsteinii
and other Bacteroidetes can induce antigen-
specific differentiation of CD4IELs. It is possi-
ble that a fraction of β-hex-specific pTreg cells
generated in the mLNs migrate to the epithe-
lium. There, they can further differentiate into
preIELs or CD4IELs after reencountering their
cognate antigen (14, 20). Because CD4+ T cells
require local engagement with MHCII to dif-
ferentiate into CD4IELs (20), recognition of an
antigen such as β-hex, present across a range of
abundant commensals, may provide CD4IEL 
precursors with a competitive advantage to 
populate the intestinal epithelium. The same 
β-hex-specific TCR rearrangement found in
the TN strain is present among CD4IELs and
their epithelial precursors in mice that carry
only the Vβ6 chain of the TN TCR and a
polyclonal TCRα chain (20). Although we iden-
tified β-hex as a natural ligand of intra-
epithelial CD4+ T cells, it is likely that other
commensal-derived antigens are also recog-
nized by these cells. Because activation of
CD4IELs is attenuated by interactions between
TL antigen expressed by intestinal epithelial
cells and CD8αα on IELs (38), antigen abun-
dance and TCR affinity may play a role in the
development of CD4IELs. In addition, com- 
mensals such as P. goldsteinii and B. vulgatus 
may not only provide TCR ligands but also 
metabolites essential for the creation of an 
environment that supports the development 
and maintenance of regulatory IELs. In this 
work, we showed that β-hex-specific T cells
that reside in the epithelium can protect against
intestinal inflammation. The TCR-titratable reg-
ulation imposed by the interaction between the
TL antigen and CD8αα (38) may control the
anti-inflammatory and cytotoxic functions of 
CD4IELs. Deciphering the rules that govern 
the mutualism between commensals, patho- 
bionts, and immune cells is essential for a better 
understanding of homeostasis and inflamma- 
tion at the intestinal mucosa. 
MUSCLE REPAIR


JMJD3 activated hyaluronan synthesis drives muscle 
regeneration in an inflammatory environment 
Kiran Nakka, Sarah Hachmer, Zeinab Mokhtari, Radmila Kovac, Hina Bandukwala,
Clara Bernard, Yuefeng Li, Guojia Xie, Chengyu Liu, Magid Fallahi, Lynn A. Megeney,
Julien Gondin, Bénédicte Chazaud, Marjorie Brand, Xiaohui Zha,
Kai Ge, F. Jeffrey Dilworth


Muscle stem cells (MuSCs) reside in a specialized niche that ensures their regenerative capacity. 
Although we know that innate immune cells infiltrate the niche in response to injury, it remains unclear 
how MuSCs adapt to this altered environment for initiating repair. Here, we demonstrate that 
inflammatory cytokine signaling from the regenerative niche impairs the ability of quiescent MuSCs to 
reenter the cell cycle. The histone H3 lysine 27 (H3K27) demethylase JMJD3, but not UTX, allowed 
MuSCs to overcome inhibitory inflammation signaling by removing trimethylated H3K27 (H3K27me3) 
marks at the Has2 locus to initiate production of hyaluronic acid, which in turn established an 
extracellular matrix competent for integrating signals that direct MuSCs to exit quiescence. Thus, 
JMJD3-driven hyaluronic acid synthesis plays a proregenerative role that allows MuSC adaptation to
inflammation and the initiation of muscle repair.


Muscle stem cells (MuSCs) provide skeletal
muscle with an efficient mode for
regeneration of damaged tissue (1). 
After injury, the regenerative process 
is initiated by necrosis of damaged 
myofibers and the release of myokines that 
instruct recruitment of various tissue-resident 
and infiltrating cell types that coordinate mus- 
cle repair (2). Tight control of signal integra- 
tion from the regenerative milieu promotes 
the expansion of muscle progenitors to medi- 
ate both myofiber repair and stem cell-niche 
repopulation (3, 4). Recent work identified a 
population of infiltrating macrophages that 
establish a transient niche necessary for quiescent
MuSCs to reenter the cell cycle (5). However,
it remains unknown how MuSCs adapt
to this modified niche to initiate regeneration. 

In response to injury, MuSCs undergo a 
stress response that is accompanied by alter- 
ation of the epigenetic landscape (6). Among 
the different epigenetic changes induced by 
injury in MuSCs, global levels of trimethylated 
histone H3 lysine 27 (H3K27me3) are reduced 
as stem cells transition from a quiescent to a 
proliferative state (7). Removal of H3K27me3 
marks is mediated by the KDM6 family of 
H3K27 demethylases, which includes JMJD3 
and UTX proteins (8, 9). Although H3K27me3 
marks are tightly linked to transcriptional 
repression (10, 11), the importance of active 
H3K27me3 removal to tissue development 
and repair has been called into question be- 
cause mouse embryos that lack both JMJD3 
and UTX survive to term (12). However, re- 
generation after injury is linked to inflamma- 
tion, which differs from tissue development. 
Here, we examined the importance of H3K27 
demethylation by JMJD3 and UTX for MuSC 
adaptation to the regenerative niche of injured 
muscle. 

The Utx and Jmjd3 genes are ubiquitously expressed, and we observed that both proteins are
present in quiescent MuSCs (fig. S1). To determine how each H3K27 demethylase contributes to
regeneration, we generated tamoxifen-inducible MuSC-specific knockouts of Utx (UTXscKO) or
Jmjd3 (JMJD3scKO) in mice (fig. S2). Upon acute cardiotoxin (CTX) injury of the tibialis
anterior (TA) muscle, both UTXscKO and JMJD3scKO mice showed impaired myofiber regeneration,
demonstrating that KDM6 proteins cannot compensate for each other in MuSC-mediated
regeneration (Fig. 1A and fig. S3). To investigate the specific roles for UTX and JMJD3,
single-cell RNA sequencing (scRNA-seq) was performed on purified MuSCs that were isolated
40 hours after injury. Pseudotime trajectory analysis showed that JMJD3scKO MuSCs were
enriched in clusters that expressed an immediate-early MuSC activation gene signature,
whereas UTXscKO cells proceeded to proliferate (Fig. 1, B and C; and fig. S4). These findings
suggested that JMJD3 might be required for efficient transitioning of MuSCs from a quiescent
to a proliferating state. Indeed, using the thymidine analog 5-ethynyl-2'-deoxyuridine (EdU)
to identify cells undergoing DNA replication, we observed that MuSCs that lacked JMJD3 were
impaired in their progression into the cell cycle (Fig. 1D). Thus, JMJD3, but not UTX, is
required for efficient activation of MuSCs. At the same time, we observed that UTX, but not
JMJD3, is required for MuSCs to exit the cell cycle and undergo terminal differentiation
(fig. S5) (13). Thus, JMJD3 and UTX play nonredundant roles in MuSC-mediated regeneration.

To determine whether JMJD3 was acting through its enzymatic activity, a mutant mouse was
generated that expresses an enzyme-dead JMJD3 protein (JMJD3°”) (fig. S6, A and B). Similar
to JMJD3scKO cells, MuSCs that expressed mutant JMJD3 were inefficient at exiting quiescence
(Fig. 1D) and, as a result, showed impaired regeneration after injury (fig. S6, C and D).
This demonstrates that JMJD3 functions through H3K27 demethylation to facilitate the exit of
quiescence for MuSCs in injured muscle.

Next, we used ex vivo experiments to explore the mechanism through which JMJD3 contributes to
the activation process. Using MuSCs purified from uninjured mice, we were surprised to find
that JMJD3scKO MuSCs reentered the cell cycle to the same extent as wild-type MuSCs (Fig. 2A
and fig. S7). Thus, in the absence of muscle injury, JMJD3 was not necessary for cell cycle
reentry. This indicated that JMJD3-mediated removal of H3K27me3 may be required for MuSCs to
integrate signals from the regenerative environment. To test this, we performed TA muscle
injury in one leg of JMJD3scKO mice and then waited 36 hours before measuring cell cycle
reentry of MuSCs isolated from the contralateral leg. MuSCs lacking JMJD3 were impaired in
their ability to reenter the cell cycle when recovering from a distal injury (Fig. 2A and
fig. S8). Thus, circulating factors that originate in the injured muscle must be responsible
for preventing MuSCs that lack JMJD3 from reentering the cell cycle. A soluble extract
prepared from an injured wild-type muscle (dMusEx) could also inhibit cell cycle reentry of
purified MuSCs that lack JMJD3 (Fig. 2B and fig. S9), which shows that the circulating
factors that inhibit MuSC activation are released from injured muscles independent of JMJD3.
Thus, JMJD3 facilitates MuSC activation in a non-cell autonomous manner by overcoming an
inhibition signal that originates in the regenerative niche. Using an extract prepared from
mice with dystrophic muscle (MdxMusEx), we also found that JMJD3 is required for MuSC
activation in conditions of chronic degeneration (Fig. 2B), highlighting the importance of
JMJD3 for mediating activation of MuSCs in muscular dystrophy.

To identify JMJD3 target genes in MuSCs, we used a combination of genomics and
transcriptomics (Fig. 3, figs. S10 and S11, table S1, and supplementary text). We defined
JMJD3 direct targets based on three properties: (i) decreased expression in JMJD3scKO MuSCs,
(ii) JMJD3 binding within 2500 base pairs of the promoter, and (iii) accumulation of
H3K27me3 marks near the gene upon ablation of JMJD3. Of the 41 genes that met these criteria
(table S1), we focused on those coding for extracellular matrix (ECM)-related proteins
because they may facilitate communication with the regenerative milieu. Has2 (Fig. 3) was of
particular interest because it encodes a key enzyme involved in the synthesis of hyaluronic
acid (HA), a glycosaminoglycan polymer that incorporates into the ECM to


Fig. 1. JMJD3 and UTX play nonredundant roles in muscle regeneration where JMJD3 is required
for injury-induced activation of satellite cells. (A) Hematoxylin and eosin staining of TA
muscle cross sections from wild-type (WT), JMJD3scKO, and UTXscKO mice at 7 days after injury
(CTX). Regeneration was quantified by measuring myofiber diameter. Data are means ± SD, and
N = 3. ****p < 0.0001, and **p < 0.01 [by analysis of variance (ANOVA)]. Scale bars are
60 μm. CSA, cross-sectional area. (B) Clustering and trajectory analysis of combined
scRNA-seq data from lineage marked MuSCs [isolated based on TdTomato expression (TdT+)] from
the WT, JMJD3scKO, and UTXscKO mice isolated 40 hours after injury. Numbers indicate distinct
MuSC clusters identified. The ovals highlight the enrichment of cells in the immediate-early
activated MuSC state. (C) Distribution of MuSCs in clusters representative of different
stages of the regenerative process. (D) Activation of MuSCs was measured by using in vivo EdU
incorporation to measure the first passage of cells through the S phase of the cell cycle
between 24 and 40 hours after injury. Fluorescence-activated cell sorting (FACS) analysis
identified MuSCs (TdT+) that were positive for EdU. Data are means ± SD, and N = 3;
***p < 0.001, **p < 0.01, and ns is not significant (by Student's t test). i.p.,
intraperitoneally.


alter rigidity (14) while acting as a rheostat for various signaling pathways (15). By
reanalyzing published datasets (16, 17), we found that Has2 is not expressed in quiescent
MuSCs (fig. S12, A and B) but is up-regulated in injured muscle as MuSCs reenter the cell
cycle. In regenerating muscle 40 hours after injury, HA enveloped the activated MuSCs of
wild-type or dystrophic mice while also contributing to the ECM of the necrotic myofibers
(Fig. 4A and fig. S13, A to D). MuSCs that lacked JMJD3 were devoid of cell-enveloping HA,
even though HA was present in the ECM of necrotic myofibers (Fig. 4A and fig. S13). This
dynamic incorporation of HA into the ECM of MuSCs suggested that it may play an important
role in regeneration of healthy muscle. To test this, we used the chemical inhibitor
4-methylumbelliferone (4-MU) to inhibit HA production (18) in wild-type MuSCs. Whereas
purified wild-type MuSCs (from uninjured mice) were able to efficiently exit quiescence in
the presence of 4-MU (Fig. 4B and fig. S14), the same MuSCs incubated with
Fig. 2. JMJD3 mediates MuSC activation in a non-cell autonomous manner. (A) Purified MuSCs
(TdT+) from injured (CTX injury to contralateral leg 36 hours before MuSC isolation from
uninjured muscles) or uninjured mice were assayed in vitro using EdU for cell cycle reentry
between 24 and 40 hours after injury. FACS analysis identified MuSCs (TdT+) that were
positive for EdU. Data are means ± SD, and N = 3; ***p < 0.001, **p < 0.01, and ns is not
significant (by Student's t test). dpi, days postinjury. (B) Ex vivo activation assay where
flexor digitorum brevis (FDB) myofiber-associated MuSCs were treated with soluble extracts
prepared from uninjured (ContMusEx), injured (dMusEx), or dystrophic (MdxMusEx) muscle.
Immunofluorescence analysis identified MuSCs (TdT+) that were positive for EdU. Data are
means ± SD, and N = 3; ***p < 0.001, **p < 0.01, and ns is not significant (by Student's
t test). Scale bars are 50 μm.


Fig. 3. Has2 is a direct transcriptional target of JMJD3. UCSC (University of California
Santa Cruz) browser track showing the Has2 locus in MuSCs at 30 hours after CTX injury. Plots
include RNA-seq data for WT, JMJD3scKO, and UTXscKO mice; CUT&Tag analysis shows enrichment
of JMJD3, UTX, and H3K27me3.


an extract from damaged muscle showed impaired cell cycle reentry upon 4-MU treatment
(Fig. 4B and supplementary text). Thus, HA synthesis is required for wild-type MuSCs to
repair muscle after injury. Addition of exogenous HA to cultures was sufficient for JMJD3scKO
MuSCs to overcome an inhibition to cell cycle reentry (Fig. 4C and figs. S15 to S17), further
emphasizing the importance of HA incorporation into the ECM of MuSCs to initiate population
expansion in response to signals from the regenerative niche.
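
The three properties used earlier to define direct JMJD3 targets amount to a conjunctive filter over per-gene evidence. The following is a minimal sketch of that logic only, not the analysis pipeline used in the study: the record type, field names, and toy values are all hypothetical, and a real analysis would add statistical thresholds that are not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneEvidence:
    """Per-gene summary. All field names are hypothetical."""
    name: str
    log2_fold_change: float             # knockout vs. wild type; negative = decreased
    peak_to_promoter_bp: Optional[int]  # signed distance of nearest JMJD3 peak; None = no peak
    h3k27me3_gain: bool                 # H3K27me3 accumulates near the gene in the knockout


def is_direct_target(gene: GeneEvidence, max_distance_bp: int = 2500) -> bool:
    """Apply criteria (i) to (iii): decreased expression, promoter-proximal
    JMJD3 binding, and H3K27me3 accumulation upon loss of JMJD3."""
    decreased = gene.log2_fold_change < 0
    bound = (gene.peak_to_promoter_bp is not None
             and abs(gene.peak_to_promoter_bp) <= max_distance_bp)
    return decreased and bound and gene.h3k27me3_gain


# Toy records with invented values, for illustration only.
genes = [
    GeneEvidence("geneA", -1.8, 400, True),     # meets all three criteria
    GeneEvidence("geneB", -0.9, 12000, True),   # peak too far from promoter
    GeneEvidence("geneC", 0.4, 150, True),      # expression not decreased
    GeneEvidence("geneD", -2.1, -2500, False),  # no H3K27me3 accumulation
    GeneEvidence("geneE", -1.2, None, True),    # no JMJD3 peak
]
direct_targets = [g.name for g in genes if is_direct_target(g)]
```

Requiring all three lines of evidence at once is what separates direct targets from genes whose expression changes only as a downstream consequence of JMJD3 loss.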

Finally, we explored the nature of the niche- 
derived signals that are responsible for im- 
peding cell cycle reentry of MuSCs that lack 
Fig. 4. HA incorporation into the ECM of MuSCs allows exit of quiescence. (A)
Immunofluorescence analysis of TA muscle cross sections at 72 hours after CTX injury from WT
and JMJD3scKO mice. MuSCs (TdT+), HA [hyaluronic acid binding protein (HABP)], and nuclear
DNA [4',6-diamidino-2-phenylindole (DAPI)] are shown. Scale bars are 75 μm. (B) Activation of
MuSCs from WT mice was measured in the presence of 4-MU by in vitro EdU incorporation. An
extract from damaged WT mouse muscle (dMusEx) was added as indicated. FACS analysis
identified MuSCs (TdT+) that were positive for EdU. Open circles indicate that the variable
was not included in the experimental samples of this condition; filled circles indicate that
the variable was included in the experimental samples of this condition. Data are means ± SD,
and N = 3; **p < 0.01, *p < 0.05, and ns is not significant (by Student's t test). (C) Ex
vivo activation assay of myofiber-associated MuSCs. FDB myofibers isolated from uninjured
mice were incubated with either a damaged muscle extract (dMusEx) or recombinant IFN-γ
protein. HA was added to cultures where indicated. Immunofluorescence analysis identified
MuSCs (TdT+) that were positive for EdU. Open circles indicate that the variable was not
included in the experimental samples of this condition; filled circles indicate that the
variable was included in the experimental samples of this condition. Data are means ± SD, and
N = 3; ***p < 0.001, **p < 0.01, and ns is not significant (by Student's t test).


JMJD3. Gene set enrichment analysis (GSEA) of RNA sequencing (RNA-seq) data identified the
interferon-γ (IFN-γ), interleukin-6 (IL-6), and tumor necrosis factor-α (TNF-α) cellular
responses as being up-regulated in MuSCs from JMJD3scKO mice (fig. S18). Using recombinant
cytokines, addition of exogenous IFN-γ or IL-6 [but not TNF-α, transforming growth factor-β
(TGF-β), or IL-4] to the culture medium was sufficient to impair cell cycle reentry of
JMJD3scKO MuSCs (Fig. 4C, figs. S19 and S20, and supplementary text). Macrophages (Ly6C+
cells) that are present within the regenerative niche at 24 hours after injury express both
IFN-γ and IL-6 (fig. S21) and likely contribute to the signaling that impairs activation of
MuSCs. This proinflammatory signaling is likely reinforced by neutrophils that are known to
secrete IFN-γ and IL-6 in regenerating muscle (19). We propose that proinflammatory cytokines
(including IFN-γ and IL-6) produced by immune cells from the regenerative environment play a
critical role in preventing MuSCs from undergoing an untimely exit of quiescence. To overcome
the cytokine-mediated block to stem cell function, MuSCs initiate Has2 expression in a
JMJD3-dependent manner, where incorporation of HA into the remodeled ECM renders the stem
cells competent to receive proregenerative signals. Thus, HA has both anti-inflammatory
effects on injured tissues (20) and proregenerative activity that stimulates MuSC-mediated
regeneration.

Our study has revealed a role for the epigenetic
enzyme JMJD3 in directing MuSC
adaptation to the regenerative niche by facilitating
expression of the JMJD3 target gene Has2.
The resulting production of HA allows MuSCs 
to integrate proregenerative signaling from 
the local environment to facilitate the repair 
of injured muscle. 
QUANTUM DYNAMICS


Observation of a continuous time crystal 


Phatthamon Kongkhambut, Jim Skulte, Ludwig Mathey, Jayson G. Cosme,
Andreas Hemmerich, Hans Keßler


Time crystals are classified as discrete or continuous depending on whether they spontaneously 
break discrete or continuous time translation symmetry. Although discrete time crystals have been 
extensively studied in periodically driven systems, the experimental realization of a continuous time 
crystal is still pending. We report the observation of a limit cycle phase in a continuously pumped 
dissipative atom-cavity system that is characterized by emergent oscillations in the intracavity photon 
number. The phase of the oscillation was found to be random for different realizations, and hence, 
this dynamical many-body state breaks continuous time translation symmetry spontaneously. 
Furthermore, the observed limit cycles are robust against temporal perturbations and therefore 
demonstrate the realization of a continuous time crystal. 


Time crystals are dynamical many-body states that break time translation symmetry in a
spontaneous and robust manner (1, 2). The original quantum time crystal envisaged by Wilczek
involves a closed many-body system with all-to-all coupling that breaks continuous time
translation symmetry by exhibiting oscillatory dynamics in its lowest-energy equilibrium
state even though the underlying Hamiltonian is time-independent (1). This would constitute a
startling state of matter in motion, fundamentally protected from bringing this motion to a
standstill through energy removal. However, a series of no-go theorems have shown that nature
prohibits the realization of such time crystals in isolated systems (3-5). The search for
time crystals was thus extended to include nonequilibrium scenarios in periodically driven
closed systems (6-8). This has led to realizations of discrete time crystals, which break the
discrete time translation symmetry imposed by the external drive (9-17). In such discrete
time crystals, during a short initial phase, the drive slightly excites the system until the
system decouples from the drive, so that further energy or entropy flow is terminated. The
system develops a subharmonic response, an intrinsic oscillation at a frequency slower than
that of the drive. Initially, it was argued that dissipation, and hence the use of open
systems, must be carefully avoided; then, so-called dissipative discrete time crystals were
theoretically predicted (18) and experimentally realized (19-21). As shown in a number of
theoretical works (22-24), the use of open systems comes with the unexpected consequence that
continuous instead of periodic driving suffices to induce time crystal dynamics. These
continuous time crystals (CTCs) realize the spirit of the original proposal more closely than
discrete time crystals and circumvent the no-go theorems through their open character.

Here, we report the observation of a CTC in the form of a limit cycle phase in a continuously
pumped dissipative atom-cavity system (Fig. 1A). In classical nonlinear dynamics, the term
"limit cycle", coined by Poincaré in a mathematical context (25), denotes a closed phase
space trajectory, asymptotically approached by at least one neighboring trajectory. Although
limit cycles are well-established in classical nonlinear physics (26), there are two
essential conditions for limit cycles in open quantum systems to form a CTC. First, the
formation of the limit cycle must be associated with spontaneous breaking of continuous time
translation symmetry. That is, the relative time phase of the oscillations for repeated
realizations takes random values between 0 and $2\pi$. Second, the limit cycle phase is
robust against temporal perturbations of technical or fundamental character, such as quantum
noise and, for open systems, fluctuations associated with dissipation. The characteristic
signature of the CTC presented here is a persistent oscillation of the intracavity intensity
and atomic density (Fig. 1, B and C), which complies with the robustness and spontaneous
symmetry-breaking criteria (Fig. 1D).

Our experimental setup consists of a Bose-Einstein
condensate (BEC) of $N_a = 5 \times 10^{4}$ $^{87}$Rb
atoms inside a high-finesse optical cavity. The
Fig. 1. CTC in an atom-cavity system. (A) Schematic drawing of the atom-cavity system pumped
transversely with an optical pump lattice, blue detuned with respect to an atomic transition.
(Inset) The photon field (blue) and the atomic density (red) of the limit cycle dynamics,
based on simulations. The blue color shading of the time axis indicates the intracavity
photon number. (B) Single experimental realization of the limit cycle phase for
$\delta_{\mathrm{eff}}/2\pi = -3.8$ kHz and $\varepsilon_f = 1.25\,E_{\mathrm{rec}}$. The
vertical dashed black line indicates the start of the 10 ms holding time, during which the
pump strength is held constant. The black line indicates the time trace of the pump strength
$\varepsilon$, and the blue line indicates the time evolution of the intracavity photon
number $N_P(t)$. (C) Normalized and rescaled single-sided amplitude spectrum of $N_P$
calculated from the data shown in (B). (D) (Top) Distribution of the time phase in the limit
cycle phase for $\delta_{\mathrm{eff}}/2\pi = -5.0$ kHz and
$\varepsilon_f = 1.25\,E_{\mathrm{rec}}$. The error bars indicate the phase uncertainty
within our discrete Fourier transform resolution of 100 Hz. However, the uncertainty with
regard to the radial dimension, the amplitude uncertainty, is negligibly small. For clarity,
we removed the error bars, around 30%, which are overlapping. (Bottom) The evolution of the
intracavity photon number for two specific experimental realizations, marked with "1" and
"2" at top, which have a time phase difference of almost $\pi$.
system is transversely pumped with a standing wave field with a wavelength
$\lambda_P = 792.55$ nm (Fig. 1A). This wavelength is blue detuned with respect to the
relevant atomic $D_1$ transition of $^{87}$Rb at a wavelength of 794.98 nm. The cavity
operates in the recoil resolved regime (27): its field decay rate
$\kappa = 2\pi \times 3.4$ kHz is smaller than the recoil frequency
$\omega_{\mathrm{rec}} = 2\pi \times 3.7$ kHz. The cavity resonance frequency $\omega_c$ is
shifted because of the refractive index of the BEC by an amount of
$\delta_- = N_a U_0/2$, where $U_0 = 2\pi \times 1.3$ Hz is the maximal light shift per
intracavity photon. We define the effective detuning as
$\delta_{\mathrm{eff}} = \delta_c - \delta_-$, where $\delta_c = \omega_P - \omega_c$ is the
detuning between the pump field frequency $\omega_P$ and the
resonance frequency of the empty cavity $\omega_c$.
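
As a rough numerical orientation, the quoted parameters can be inserted into these definitions. The sketch below takes the atom number as $5 \times 10^{4}$ and uses the effective detuning of $-5.0$ kHz quoted in the caption of Fig. 1D. It is plain arithmetic on the stated values, assumes no additional shifts, and is not a result reported by the authors.

```latex
\begin{align*}
\frac{\delta_-}{2\pi} &= \frac{N_a}{2}\,\frac{U_0}{2\pi}
  = \frac{5 \times 10^{4}}{2} \times 1.3\ \mathrm{Hz} = 32.5\ \mathrm{kHz}, \\
\frac{\delta_c}{2\pi} &= \frac{\delta_{\mathrm{eff}}}{2\pi} + \frac{\delta_-}{2\pi}
  = -5.0\ \mathrm{kHz} + 32.5\ \mathrm{kHz} = 27.5\ \mathrm{kHz}, \\
\frac{\kappa}{\omega_{\mathrm{rec}}} &= \frac{3.4\ \mathrm{kHz}}{3.7\ \mathrm{kHz}} \approx 0.92 < 1 .
\end{align*}
```

The last line restates the recoil-resolved condition: the cavity field decays more slowly than the recoil frequency scale of the atoms.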
To determine the regime of the CTC, we measured the time dependence of the intracavity photon
number $N_P(t)$ that emerges in the protocol given below. We show $N_P(t)$ in Fig. 2A and two
derived quantities, the crystalline fraction $\Xi$ and the limit-cycle frequency
$\omega_{LC}$, in Fig. 2, B and C, respectively. In our protocol, the intracavity photon
number $N_P(t)$ was recorded as we linearly ramped the pump strength $\varepsilon$ from 0 to
3.5 $E_{\mathrm{rec}}$ within 10 ms, while keeping $\delta_{\mathrm{eff}}$ fixed. Initially,
for weak pump intensities, the BEC phase was stable, and $N_P$ was zero. Above a critical
value of $\varepsilon$, the BEC became unstable toward the formation of a self-organized
superradiant phase heralded by a nonzero $N_P$. This represents a many-body state as the
cavity photons mediate a retarded infinite-range interaction between the atoms. Although this
superradiant phase transition has been intensively studied for a red-detuned pump (28-31), it
has only been realized recently for a blue-detuned pump after its theoretical prediction
(32, 33). For blue detuning, the atoms are low-field seeking and localize at the intensity
minima of the light field. Nevertheless, the atoms can still self-organize into the
superradiant phase, as evident from the large blue areas shown in Fig. 2A. However, the
self-organized superradiant phase may become unstable for higher pump strengths because it
costs energy for the atoms to localize away from the nodes of the pump lattice. This behavior
leads to the disappearance of the self-organized phase for higher pump strengths (32). A
phase diagram in fig. S1 in (34) shows a larger range of $\varepsilon$, demonstrating the
disappearance of the self-organization for strong pumping. In the recoil-resolved regime,
because of the retarded character of the cavity-mediated interaction, we additionally
observed the emergence of a new dynamical phase or a limit cycle phase characterized by
self-sustained oscillations of $N_P$ as the atoms cycled through different density wave
patterns (33, 35). The
Fig. 2. Determining the time-crystalline regime. (A) (Top) Pump strength
protocol. (Bottom) The corresponding intracavity photon number $N_P$ as a
function of $\delta_{\mathrm{eff}}$ and $\varepsilon$. The area enclosed by the yellow dashed lines indicates
the parameter space spanned in (B) and (C). (B) Relative crystalline fraction
$\Xi$ and (C) limit cycle frequency $\omega_{LC}$ plotted versus $\delta_{\mathrm{eff}}$ and $\varepsilon_f$. To obtain (B)
and (C), for fixed $\delta_{\mathrm{eff}}$, the pump strength is ramped to its final value $\varepsilon_f$ and
resolution of the experimental imaging system
is insufficient to observe the real-space density 
of the cloud; instead, simulations of the evo- 
lution of the single-particle density by use of a 
mean-field model are shown in fig. S3 (34). Phys- 
ically, the limit cycles can be understood as a 
competition between opposing energy contri- 
butions: one coming from the pump lattice 
potential, and another coming from the cavity- 
induced all-to-all interaction between the atoms 
(33). In the superradiant phase, the cavity- 
induced interaction energy dominates, and the 
atoms localize at the antinodes. In the limit 
cycle phase for sufficiently strong pump inten- 
sities, localization of low-field-seeking atoms 
at the antinodes becomes energetically costly, 
resulting in a decrease in the density modu- 
lations and Np as the system attempts to go 
back to the normal homogeneous phase. How- 
ever, this is unstable toward self-organization 
because the chosen pump strength already ex- 
ceeds the critical value, and thus, the cycle 
starts anew. The regime of recoil-resolution 
of the cavity, in which the dynamics of the 
atomic density and the light field evolve with 
similar time scales, has turned out to be the 
key ingredient to realize the limit cycle phase. 
This can be understood by noting that the 
delayed dynamics of the cavity field, with 
respect to the atomic density, leads to cavity 
subsequently held constant for 10 ms. The relative crystalline fraction $\Xi$ and the
corresponding value of $\omega_{LC}$ identify the time-crystalline state. The parameter
space is divided into 20 by 24 plaquettes and averages across 5 to 10
experimental implementations are produced. The white cross indicates the
parameter values $\delta_{\mathrm{eff}}/2\pi = -5.0$ kHz and $\varepsilon_f = 1.25\,E_{\mathrm{rec}}$. The white area in (C)
corresponds to data with $\Xi$ below $1/e$.
cooling, which in contrast to broadband cavity
setups restricts the atoms to occupy only a 
small number of momentum modes. This pre- 
vents the system from heating up and enter- 
ing chaotic dynamics. We observed the limit 
cycle phase in the region shown in Fig. 2A 
enclosed by the yellow dashed lines. To fur- 
ther highlight the dynamical nature of this 
phase, we show a typical single-shot realiza- 
tion in Fig. 1, B and C. 

Next, we quantitatively identified the area in the parameter space, spanned by the pump
strength $\varepsilon$ and the effective detuning $\delta_{\mathrm{eff}}$, where limit cycles
can be observed. For fixed $\delta_{\mathrm{eff}}$, we linearly ramped $\varepsilon$ to the
desired final value $\varepsilon_f$, using the same slope as for the measurement shown in
Fig. 2A, and held $\varepsilon$ constant for 10 ms. The protocol is depicted by the black
curve in Fig. 1B. We show in Fig. 1C an example of the normalized and rescaled single-sided
amplitude spectrum $\tilde{N}_P(\omega) = N_P(\omega)/N_{P,\max}(\omega_{LC})$ obtained from
$N_P(t)$ within the holding time window [0,10] ms in Fig. 1B. $N_P(\omega)$ is the normalized
single-sided amplitude spectrum, and $N_{P,\max}(\omega_{LC})$ is the maximum value of the
measured limit cycle amplitude. In the case of pronounced limit cycle dynamics as in Fig. 1C,
the single-sided amplitude spectrum shows a distinct peak, with a width associated with the
limit cycle lifetime of several milliseconds. The narrowest peaks observed exhibit an
$e^{-2}$ width $\Delta\omega = 2\pi \times 1.4$ kHz. The limit cycle frequency $\omega_{LC}$,
plotted in Fig. 2C, is defined as the frequency of the dominant peak in the single-sided
amplitude spectrum within the frequency interval
$\Delta_{LC} = [3.5, 15.5] \times 2\pi$ kHz, chosen much larger than
$\delta_{LC} \in [\omega_{LC} - \Delta\omega/2, \omega_{LC} + \Delta\omega/2]$. The
oscillation frequency of a CTC is not necessarily fixed, and robustness refers to the
persistence of the CTC in the thermodynamic limit and for a wide range of system parameters
[finite-size effects are discussed in the supplementary materials (34)] (22). We calculated a
common measure for time crystallinity, the crystalline fraction $\Xi'$ (10, 11), as the ratio
between the area under the single-sided amplitude spectrum within $\delta_{LC}$ and the total
area within $\Delta_{LC}$. That is,
$\Xi' = \sum_{\omega \in \delta_{LC}} \tilde{N}_P(\omega) / \sum_{\omega \in \Delta_{LC}} \tilde{N}_P(\omega)$. The
relative crystalline fraction $\Xi$ shown in Fig. 2B is normalized to the maximum crystalline
fraction measured in the parameter space explored in this work. Because of the finite
lifetime of the BEC, it is difficult to access the long-time behavior of the system, which
makes it experimentally challenging to distinguish between the areas of stable limit cycle,
chaos, and possible transient phases. Hence, we define a cut-off or threshold value for the
relative crystalline fraction, $\Xi_{\mathrm{cut}} = 1/e$, to identify regions with
observable limit cycle dynamics. In Fig. 2C, the frequency response of the limit cycle phase
is only shown if its relative crystalline fraction is higher than the cut-off value:
$\Xi > \Xi_{\mathrm{cut}}$. The experimental lifetime of our time crystal is limited by atom
loss. Furthermore, the short-range contact interaction, due to collisions between the atoms,
leads to dephasing of the system and hence melting of the time crystal. Simulations that
include contact interactions and phenomenological atom loss can be found in the
supplementary materials.
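
The crystalline fraction described above can be illustrated on synthetic data. The following is a minimal sketch, not the authors' analysis code: it assumes a uniformly sampled trace with a hypothetical 1 µs sampling interval, takes the search window as 3.5 to 15.5 kHz, and treats the peak width as an illustrative parameter. The function names are invented for this example.

```python
import numpy as np


def single_sided_amplitude_spectrum(signal, dt):
    """Return (frequencies in Hz, single-sided amplitude spectrum) of a real trace."""
    n = len(signal)
    spectrum = np.abs(np.fft.rfft(signal)) / n
    # Fold negative frequencies onto positive ones; the DC bin (and the
    # Nyquist bin for even n) have no mirror partner and are not doubled.
    if n % 2 == 0:
        spectrum[1:-1] *= 2.0
    else:
        spectrum[1:] *= 2.0
    return np.fft.rfftfreq(n, d=dt), spectrum


def crystalline_fraction(freqs, spectrum, window_hz, peak_width_hz):
    """Ratio of spectral weight around the dominant peak to the weight in the window."""
    in_window = (freqs >= window_hz[0]) & (freqs <= window_hz[1])
    f_win, a_win = freqs[in_window], spectrum[in_window]
    f_lc = f_win[np.argmax(a_win)]                      # dominant peak = limit cycle frequency
    in_peak = np.abs(f_win - f_lc) <= peak_width_hz / 2
    return f_lc, a_win[in_peak].sum() / a_win.sum()


# Synthetic 10 ms holding window (assumed sampling interval: 1 microsecond).
dt, n = 1e-6, 10_000
t = np.arange(n) * dt
clean = 1.0 + np.cos(2 * np.pi * 9.7e3 * t + 0.3)       # offset plus a 9.7 kHz oscillation
noisy = clean + np.random.default_rng(0).normal(0.0, 1.0, n)

window, width = (3.5e3, 15.5e3), 1.4e3                  # illustrative values in Hz
f_clean, frac_clean = crystalline_fraction(*single_sided_amplitude_spectrum(clean, dt), window, width)
f_noisy, frac_noisy = crystalline_fraction(*single_sided_amplitude_spectrum(noisy, dt), window, width)

# Relative fraction: normalize to the largest value found, then apply the 1/e cut-off.
fractions = np.array([frac_clean, frac_noisy])
relative = fractions / fractions.max()
observable = relative > 1.0 / np.e
```

Adding broadband noise spreads spectral weight across the whole window, so the fraction concentrated in the peak drops. That is the sense in which this quantity measures how crystalline a trace is.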

The spontaneous symmetry breaking of a many-body system indicates a phase transition. We
obtained strong evidence that the limit cycle phase emerges through spontaneous breaking of
continuous time translation symmetry, and thus, it is a CTC. We repeated the experimental
pump protocol shown as the black line in Fig. 1B more than 1500 times with fixed
$\delta_{\mathrm{eff}}/2\pi = -5.0$ kHz and $\varepsilon_f = 1.25\,E_{\mathrm{rec}}$. These
parameter values are indicated in Fig. 2C with a white cross. Because of technical
instabilities, the number of atoms in the BEC $N_a$ fluctuates by 5%. This leads to a
fluctuating value of $\delta_{\mathrm{eff}}$ and hence of $\omega_{LC}$. Pictorially, this
can be understood by observing that fluctuations in $N_a$ effectively shift the CTC regime in
Fig. 2C either up or down. For the parameter values indicated by a white cross in Fig. 2C,
the median of $\omega_{LC}$ is $\bar{\omega}_{LC} = 2\pi \times 9.69$ kHz. Our discrete
Fourier transform resolution, set by the 10-ms time window, is 100 Hz. Thus, we only
considered experimental runs that yielded response frequencies of
$\omega_{LC} = \bar{\omega}_{LC} \pm 2\pi \times (50\ \mathrm{Hz})$. For each single-shot
measurement, we obtained the time phase defined as
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pump strength e. (Bottom) 
Corresponding dynamics of Np. 
(C and D) Single-sided amplitude 
spectra of (A) and (B), respec- 
tively. (E) Relative crystalline 
fraction for varying noise strength 
nand fixed 8,/2n = -5.0 kHz 
and ef = 1.25 Erec. 
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the principal argument arg[Vp(@;,c)] of the 
Fourier transformed intracavity photon num- 
ber Np(@rc) evaluated at the limit cycle fre- 
quency @y,¢. In Fig. 1D, we show the distribution 
of the observed time phases, which randomly 
covers the interval [0,21]. This corroborates the 
spontaneous breaking of continuous time 
translation symmetry in the limit cycle phase. 
In the bottom of Fig. 1D, we show two specific 
experimental realizations, which have a time 
phase difference of almost x. Simulations rep- 
resenting the BEC as a coherent state show a 
range of the response frequency distribution 
of 300 Hz. Because we post-selected our data 
far below this limit, the origin of the spread 
over 2x in the time phase distribution is not 
due to technical noises but rather to quantum 
fluctuations. In the supplementary materials, 
we show a more detailed theoretical analysis to 
support this argument. The error bars along the 
angular direction in Fig. 1D indicate the phase 
uncertainty within 100 Hz of our Fourier limit. 
The average phase uncertainty is around 
0.25n. The uncertainty in the radial direction 
corresponding to the oscillation amplitude is, 
however, negligible. Moreover, we removed 
30% of the error bars for clarity in Fig. 1D. 
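
The post-selection and time-phase extraction described above can be illustrated with a short sketch. It is not the authors' code: the sampling interval, the synthetic trace, and the function names are assumptions made for illustration. A 10-ms window gives a discrete Fourier transform bin spacing of 1/(10 ms) = 100 Hz, the time phase is taken as the argument of the spectrum at its peak, and runs are kept only if their response frequency lies within 50 Hz of the median.

```python
import numpy as np


def limit_cycle_frequency_and_phase(n_p, dt):
    """Return (peak frequency in Hz, phase in [0, 2*pi)) of a photon-number trace."""
    n_p = np.asarray(n_p, dtype=float)
    spectrum = np.fft.rfft(n_p - n_p.mean())  # remove the DC offset first
    freqs = np.fft.rfftfreq(n_p.size, d=dt)
    k = int(np.argmax(np.abs(spectrum)))      # bin with the largest response
    return freqs[k], float(np.angle(spectrum[k]) % (2 * np.pi))


def post_select(freqs_hz, half_width_hz=50.0):
    """Keep runs whose response frequency lies within +/- half_width of the median."""
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    return np.abs(freqs_hz - np.median(freqs_hz)) <= half_width_hz


# Synthetic single shot: 10-ms window sampled every microsecond (assumed values)
dt = 1e-6
t = np.arange(10_000) * dt
trace = 5.0 + np.cos(2 * np.pi * 9700.0 * t + 1.0)
f_lc, phase = limit_cycle_frequency_and_phase(trace, dt)

# Hypothetical response frequencies (Hz) from four runs
keep = post_select([9600.0, 9690.0, 9700.0, 9900.0])
```

Collecting `phase` over many post-selected runs would give a histogram of the kind shown in Fig. 1D.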
Last, we demonstrated the robustness of the 
limit cycle phase against temporal perturbations, 
which is a defining feature of time crystals. We 
introduced white noise onto the pump signal 
with a bandwidth of 50 kHz. The noise strength 
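is quantified by $n \equiv \sum_{\omega} |A_{\mathrm{noisy}}(\omega)| / \sum_{\omega} |A_{\mathrm{clean}}(\omega)| - 1$,
where $A_{\mathrm{noisy}}$ and
$A_{\mathrm{clean}}$ are the single-sided amplitude spectra
of the pump in the presence and absence
of white noise, respectively. We chose the parameters
$\delta_{\mathrm{eff}}/2\pi = -5.0$ kHz and $\varepsilon_f = 1.25\,E_{\mathrm{rec}}$,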
in the center of the stable limit cycle region, 
indicated by the white cross in Fig. 2C, and 
added white noise with varying strengths. In 
Fig. 3, A and B, top, single-shot realizations of 
the noisy pump signal are shown for weak 
and strong noise, respectively. The corresponding
dynamics of $N_P$ are shown in Fig. 3, A and B,
bottom. In Fig. 3E, we show how increasing
the noise strength can “melt” the CTC, as inferred
from the decreasing relative crystalline
fractions calculated from single-sided amplitude
spectra similar to those shown in Fig. 3, C
and D. The system takes time to react to the
noise, so that a few oscillations can always 
be observed before decay sets in. This leads 
to an offset of 0.4 in the crystalline fraction, 
even for very strong noise. Nevertheless, we 
found that the limit cycle phase indeed ex- 
hibits robust oscillatory behavior over a wide 
range of the noise strength. This, together 
with the observation of spontaneous breaking 
of a continuous time translation symmetry, 
suggests that the observed limit cycle phase is 
a CTC. 
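
As a rough illustration of how such a noise strength could be evaluated, the sketch below assumes that n is the ratio of the summed single-sided amplitude spectra of the noisy and clean pump signals, minus one, so that n = 0 without noise. The constant clean pump, the Gaussian white noise, and the function names are hypothetical choices for this example and are not taken from the experiment.

```python
import numpy as np


def single_sided_amplitude(x):
    """Single-sided amplitude spectrum of a real-valued signal."""
    x = np.asarray(x, dtype=float)
    n = x.size
    amp = np.abs(np.fft.rfft(x)) / n
    if n % 2 == 0:
        amp[1:-1] *= 2.0  # double every bin except DC and Nyquist
    else:
        amp[1:] *= 2.0    # double every bin except DC
    return amp


def noise_strength(clean, noisy):
    """Assumed definition: n = sum|A_noisy| / sum|A_clean| - 1."""
    return single_sided_amplitude(noisy).sum() / single_sided_amplitude(clean).sum() - 1.0


rng = np.random.default_rng(0)
clean = np.ones(1000)                               # hypothetical constant pump
noisy = clean + 0.5 * rng.standard_normal(clean.size)  # pump with added white noise

n_none = noise_strength(clean, clean)
n_strong = noise_strength(clean, noisy)
```

With this definition, a pump without added noise gives n = 0, and n grows as the noise fills more of the spectrum.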

We have experimentally demonstrated a 
CTC and provided a theoretical understand- 
ing. This class of dynamical many-body states 
expands the concepts of long-range order and 
spontaneous symmetry breaking into the time 
domain and is therefore of fundamental in- 
terest. This result, and the precision and con- 
trol achieved with our atom-cavity platform, 
paves the way toward a broad and compre- 
hensive study of dynamical many-body states 
of bosonic or fermionic quantum matter in the 
strongly correlated regime. For example, an 
increased atom-photon coupling could gener- 
ate a new class of time crystals associated with 
symmetry-broken periodic entanglement. Fur- 
thermore, technological applications, such as 
toward time metrology, can be envisioned.

CRYSTALLIZATION


Synchronous assembly of chiral skeletal 
single-crystalline microvessels 


Osamu Oki, Hiroshi Yamagishi, Yasuhiro Morisaki, Ryo Inoue, Kana Ogawa, Nanami Miki,
Yasuo Norikane, Hiroyasu Sato, Yohei Yamamoto*


Skeletal or concave polyhedral crystals appear in a variety of synthetic processes and natural 
environments. However, their morphology, size, and orientation are difficult to control because of their 
highly kinetic growth character. We report a methodology to achieve synchronous, uniaxial, and stepwise 
growth of micrometer-scale skeletal single crystals from planar-chiral double-decker molecules. Upon 
drop-casting of a heated ethanol solution onto a quartz substrate, the molecules spontaneously assemble into 
standing vessel-shaped single crystals uniaxially and synchronously over a wide area of the substrate,
with small size polydispersity. The crystal edge is active even after consumption of the molecules and resumes
stereoselective growth with successive feeding. The resultant morphology can be packed into polycyclic 
aromatic hydrocarbon-like microarchitectures and behaves as a microscopic container. 


Concave polyhedral crystals, typically
termed “skeletal crystals,” are naturally
and synthetically ubiquitous crystal forms
that feature intricate morphologies with
well-developed crystalline edges and facets
instead of a body (1-3). The diffusion-limited
growth and resultant intricate morphology are 
poorly controlled by means of conventional 
methodologies and are therefore of particu- 
lar interest in the field of crystal engineering 
(4-7). Furthermore, as typically shown in snowflakes
and bismuth, skeletal single crystals are
not only complicated in their morphology but 
also feature geometric symmetry and homo- 
geneous molecular arrangement throughout 
the entire crystalline grain, which distinguishes 
them from polycrystalline, semicrystalline, and 
amorphous solids (8-11). 

Crystal engineering techniques for growing 
skeletal single crystals still lag behind their 
thermodynamic counterparts (12-17). Most efforts
to sculpt skeletal structures have focused
on amphiphilic surfactants added under thermodynamic
conditions (18-22). These chemical additives
attach to specific crystal facets
and regulate crystal growth, yielding spiky and 
concave grains. Although these methods are 
useful, the available morphologies are typi- 
cally distinct from those developed under 
kinetic conditions. Controlling the growth 
kinetics of skeletal single crystals has been an 
even more difficult task because their synthetic 
conditions are far from equilibrium. We therefore
explore whether fine modulation of the
kinetic growth of skeletal crystals is feasible in
the absence of chemical additives.

Fig. 1. Morphological and crystallographic investigation of skeletal
vessel-shaped microcrystals. (A) Schematic diagram of the time-course
profile of the concentration of (S)-CP4. The concentration initially increases
as the mother solvent evaporates (purple region) and peaks when nucleation
and growth of the hexagonal plate begins (cyan region). The crystal growth
regime changes into edge growth (green region) followed by facet growth
(orange region), featuring a faster growth rate. After the molecules are mostly
consumed, the crystal growth halts (red region). (B) Schematic representations
of the synchronous and uniaxial skeletal crystal growth regimes of
plate growth, edge growth, and facet growth, corresponding to (ii), (iii), and
(iv) in Fig. 1A. (C) Molecular structure of (S)-CP4. Me, methyl. (D) SEM
images of the resultant microcrystals. The sample stage is tilted by 80° (main
panel) and 50° (inset). Scale bars: 10 μm (main); 2 μm (inset). (E) Diffraction
spots from a single vessel-shaped microcrystal projected onto the reciprocal
lattice. The inset shows a photograph of the vessel-shaped microcrystal mounted
on a thin needle for XRD measurement. Scale bar: 20 μm. (F) Schematic
representations of the molecular packing arrangement of (S)-CP4 viewed from
the c-axis (left) and b-axis (right) directions. The intermolecular interactions are
visualized with orange circles (π-π) and blue (C-H···O) and green (C-H···π) dashed
lines. The CPK models of (S)-CP4 molecules in each layer are colored red,
orange, green, blue, magenta, and gray in this order. (G) PXRD pattern of
microcrystals of (S)-CP4 grown on a quartz substrate (blue, top), and simulated
XRD pattern from single-crystal XRD analysis (black, bottom). The broad peak
originates from the amorphous quartz substrate. a.u., arbitrary units.

We report skeletal single crystals that grow
in a synchronous, uniaxial, and stepwise manner
within a few tens of seconds (Fig. 1, A and B).
Skeletal crystals are grown from an S-planar-chiral
aromatic molecule, [2.2]paracyclophane
appended with four (methoxyphenyl)ethynyl
arms [(S)-CP4], as well as its enantiomeric
counterpart [(R)-CP4] (Fig. 1C and fig. S1)
(23, 24). A heated solution of (S)-CP4 (0.10 mg
in 100 μl of ethanol, 80°C) is gently cast on a
clean amorphous quartz substrate under ambient
conditions and covered with a glass dish
(fig. S2). Immediately after the casting, the
EtOH solution is cooled by the substrate and
the surrounding atmosphere and reaches
supersaturation, resulting in the formation of
micrometer-scale particles on the substrate.
Scanning electron microscopy (SEM) and 
fluorescence microscopy (FM) images reveal 
that the particles form a vessel-like concave 
morphology (Fig. 1D and fig. S3A). The base 
part of the vessel forms a hexagonal plate
with average side length and height of 0.8 and
0.3 μm, respectively (fig. S3, B to E). Planar
lateral sidewalls grow outward with hexagonal
symmetry on the hexagonal base plate with a
dihedral angle of 60° relative to the substrate,
forming a hexagonal pyramidal void inside the
facets. The thickness of the sidewall (0.5 to
0.6 μm) is nearly constant from bottom to top
(fig. S3, F to H).

A vessel-shaped particle is picked up with a 
microneedle and directly subjected to single- 
crystal x-ray structure analysis. The diffraction 
spots projected onto the reciprocal lattice show 
a hexagonally symmetric pattern (Fig. 1E). Con- 
comitantly, the crystal structure of a single 
microvessel is successfully resolved. In the 
crystal, despite its apparent geometrical complexity,
(S)-CP4 molecules are arranged in a
highly symmetric manner with the polar space 
group P6;. In the unit cell, six symmetrically 
equivalent (S)-CP4 molecules stack on one 
another with a counterclockwise rotation of 
60° along a crystallographic sixfold screw 
axis (Fig. 1F, table S1, and fig. S4). Orthogonal
to the screw axis, (S)-CP4 molecules form a
unimolecular-thick double-decker aromatic
sheet with all six aromatic rings virtually parallel
to the crystallographic ab plane, with an average
tilt angle of 6.94 ± 2.94°.

The (S)-CP4 molecules form complementary
π-π and C-H···O interactions side by side
within the double-decker sheet (Fig. 1F). The
double-decker sheets stack
on each other via multiple C-H···π and C-H···O
interactions, although, counterintuitively, no
obvious π-π interaction is found between the
sheets (Fig. 1F and fig. S5). The plane-to-plane
distances of the adjacent anisolyl rings are
short (3.045 Å), but the rings are largely displaced in the
crystallographic ab plane and barely overlap.
Structural analysis of (R)-CP4 is likewise
accomplished, showing a mirror-image molec- 
ular arrangement with the space group P6, 
(table S2 and fig. S6). 

The crystal structure, together with the 
simultaneously indexed crystal facets, reveals 
the molecular arrangement at the crystal sur- 
faces and molecular interactions inside the 
microvessel (Fig. 1F). At the bottom facet of 
the vessel—namely, the crystallographic (001) 
plane—the aromatic rings of the (S)-CP4 
molecules lie parallel to the quartz substrate. 
The lateral facets of the vessel—that is, the 
crystallographic {102} planes—are fully covered 
with the methoxy groups of (S)-CP4, and this 
arrangement is beneficial to facilitating the 
solvophilic effect with EtOH. On the upper 
side of the vessel, concave facets are developed
instead of the planar (00-1) facet so that
(S)-CP4 molecules can conceal the nonpolar
aromatic rings from the polar EtOH molecules.

Growth into vessel-shaped crystals occurs 
uniformly over a large area on the substrate 
surface with a narrow size distribution [average
diameter of the hexagonal inscribed circle
(d_hic,av): 4.3 μm, standard deviation (σ): 0.41 μm]
(Fig. 2A and fig. S7). The powder x-ray diffraction
(PXRD) profile of the microcrystals grown
on the substrate exhibits a sharp, distinct
diffraction peak at 27.5° (d = 3.2 Å), which is
attributed to the (0 0 12) plane (Fig. 1G and
fig. S8). The observed single peak in the PXRD 
pattern unambiguously indicates the uniaxial 
growth of the vessel-shaped single crystals on 
the substrate. 
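
The quoted d spacing can be checked against the peak position with Bragg's law, d = λ / (2 sin θ). The sketch below is illustrative only: the x-ray wavelength is not stated in this passage, so Cu Kα radiation (1.5418 Å) is assumed, and the function name is hypothetical. For a hexagonal lattice, the (0 0 l) spacing equals c/l, so the same peak also gives a rough estimate of the c-axis length under that assumption.

```python
import math

CU_K_ALPHA_ANGSTROM = 1.5418  # assumed wavelength (Cu K-alpha); not stated in the text


def bragg_d_spacing(two_theta_deg, wavelength=CU_K_ALPHA_ANGSTROM, order=1):
    """Lattice spacing d (in angstroms) from the diffraction angle 2-theta (in degrees)."""
    theta = math.radians(two_theta_deg / 2.0)
    return order * wavelength / (2.0 * math.sin(theta))


d_0012 = bragg_d_spacing(27.5)  # spacing for the peak at 2-theta = 27.5 degrees
c_axis = 12 * d_0012            # (0 0 12) reflection: d = c / 12 in a hexagonal cell
```

Under the assumed wavelength, `d_0012` comes out close to the 3.2 Å quoted in the text.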

The monodisperse size distribution of the 
resultant crystals suggests synchronous growth 
of the vessel crystals with a virtually identical 
rate. To address the growth kinetics, in situ 
visualization of the growing particles is con- 
ducted by means of FM (movie S1). At t = 0 s, 
immediately after the casting of the super- 
saturated EtOH solution, blue emission from 
(S)-CP4 dissolved in EtOH is homogeneously
observed in the entire area. Nucleation of the
hexagonal plate occurs at t = 2.0 s (Fig. 2C),
followed by continuous growth until t = 12 s
(Fig. 2D). At t = 12 s, (S)-CP4 dissolved in
EtOH is almost consumed, and the growth is
halted (Fig. 2E). Statistical analyses of d_hic,av
of the crystals show that σ (<0.32 μm) and the
polydispersity index (PDI; <0.02) of the particle
size remain small throughout
the growth process (Fig. 2B).
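
The size statistics above can be reproduced from a list of measured diameters. The sketch below is an illustration under a stated assumption: the passage does not define the PDI, so the common definition PDI = (σ / mean)² is used here, and the function names and sample diameters are hypothetical.

```python
import statistics


def pdi_from_moments(mean, sigma):
    """Polydispersity index under the assumed definition (sigma / mean) ** 2."""
    return (sigma / mean) ** 2


def size_statistics(diameters_um):
    """Return (mean, sample standard deviation, PDI) for diameters in micrometers."""
    mean = statistics.fmean(diameters_um)
    sigma = statistics.stdev(diameters_um)
    return mean, sigma, pdi_from_moments(mean, sigma)


# Values quoted in the text for the final crystals: mean 4.3 um, sigma 0.41 um
pdi_final = pdi_from_moments(4.3, 0.41)

# Hypothetical diameters (um) measured in one snapshot
mean, sigma, pdi = size_statistics([4.0, 4.5, 5.0])
```

With the quoted mean and standard deviation, this definition gives a PDI of about 0.009, which is below the 0.02 bound reported for the growth process.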

We observe the morphologies of the grow- 
ing particles by removing the mother solvent 
via a spin coater during the growth process. 
SEM images of the growing particles show spiky 
hexapodal edges standing on a thin hexagonal 
plate (fig. S9). Consequently, we conjecture 
how vessel-shaped growth is facilitated at the 
molecular scale [Figs. 1A and 3A, (ii)]. Imme- 
diately after casting, the temperature of the 
solution is still high, and crystal growth proceeds 
under pseudothermodynamic conditions, yield- 
ing a hexagonal plate with its face parallel to the 
substrate. Upon further cooling, the degree of 
supersaturation drastically increases, which alters 
the crystal growth regime to edge growth. Because
the highly kinetic edge growth consumes
the dissolved (S)-CP4 much faster than
the cooling-induced increase in supersaturation,
the growth regime changes into facet growth after
a certain period of time, and the growth rate is then
slow enough to fill the gaps between the spikes
but too fast to fill the inner void completely.

Fig. 2. Synchronous crystal growth. (A) Histogram of d_hic of the microcrystals measured from total
selected areas I to V in fig. S7. (B) Plots of d_hic,av (red hexagons) and PDI (blue diamonds) of the
microcrystals versus t. The standard deviation of d_hic,av was measured on the basis of 100 crystals found in
each snapshot extracted from movie S1. Purple, cyan, green, orange, and red regions represent
prenucleation, hexagonal plate growth, edge growth, facet growth, and completion of crystal growth,
respectively. (C to E) Snapshots of FM images of microcrystals of (S)-CP4 at t = 2.0, 6.5, and 14 s after
drop-casting of the EtOH solution. Scale bars: 10 μm. The excitation wavelength (λex) = 340 to 390 nm.

5 AUGUST 2022 • VOL 377 ISSUE 6606 675

RESEARCH | REPORTS

[Fig. 3 panel labels: (A) Hexagonal plate; (i) Edge growth, [(S)-CP4]: 2.0 mg mL⁻¹; (ii) Facet growth, [(S)-CP4]: 1.0 mg mL⁻¹; (iii) Body growth, [(S)-CP4]: 0.4 mg mL⁻¹. (B) (S)-CP4 seeds with (i) (S)-CP4 feeds, (ii) (R)-CP4 feeds, and (iii) racemic feeds.]

Fig. 3. Morphology control and hierarchical crystal growth. (A) Schematic representation of the (i) edge-, (ii) facet-, and (iii) body-growth crystallizations with respect to
the different initial concentrations of (S)-CP4, and corresponding SEM micrographs of the resultant microcrystals. Scale bars: 3 μm. (B) Scheme of the stepwise crystal
growth and hierarchical microstructures after the addition of EtOH solutions of (i) (S)-CP4, (ii) (R)-CP4, and (iii) their racemic mixture to the vessel-shaped crystal seeds of
(S)-CP4, and corresponding SEM micrographs of the resultant microstructures. Scale bars: 20 μm (left) and 5 μm (right) for (i) and (ii); 10 μm (left) and 5 μm (right) for (iii).

The transitions of the crystal growth regime
are further supported by the in situ FM
observations (Fig. 2, C to E). In the snapshots,
the blue emission in the background gradually
fades and finally becomes dark as the crystals
grow, representing a decrease in the concentration
of (S)-CP4 in the focal plane. Accordingly,
the concentrations of (S)-CP4 in the
EtOH solution are evaluated during the crystal
growth (fig. S10). The concentration initially
increases because of the evaporation of EtOH
and reaches a peak at t = 0.88 s, at which point
nucleation occurs. Thereafter, as the hexagonal
plates grow, the concentration decreases at
a relatively slow rate. At t = 2.8 s, a transition
to a faster decrease in concentration occurs,
which is attributed to the change in the crystal
growth regime from body growth to edge and
facet growth (Fig. 1A).

This insight is highly informative because 
the transition from edge growth to facet growth 
takes place in an experimentally available concentration
range. Accordingly, the initial concentration
of (S)-CP4 in the feed EtOH solution is
increased to 2.0 mg ml⁻¹, and the solution is
cast onto a quartz substrate. After drying, the
substrate is covered by crystalline particles featuring
flower-like hexapodal branches oriented
outward with C6 symmetry [Fig. 3A, (i), and
movie S2]. On the other hand, when the solution
of (S)-CP4 is diluted to a concentration
of 0.4 mg ml⁻¹, the resultant crystals feature a
jewel-like convex polygonal morphology [Fig.
3A, (iii)], demonstrating comprehensive control
of the growth regime from facet growth to
edge and body growth by changing the initial
concentration.

Conventional core/shell-type crystals have 
been synthesized by the successive feeding of 
the constituent molecules (17, 18). Inspired by 
this technique, we conduct successive addition 
of fresh solution to explore whether the vessel 
crystals likewise grow incrementally. Fifteen
seconds after casting of the first feed solution
of (S)-CP4 (7.0 μl, 1.0 mg ml⁻¹) onto a substrate,
the secondary feed solution is gently
added at identical volume, concentration, and
temperature (80°C). Although the exterior morphology
of the resultant crystals still features
a hexagonal vessel shape with smooth facets,
stepped facets form in the interior, and d_hic,av
increases from 7.0 to 11 μm, demonstrating the
active nature of the crystalline edges at the top
[Fig. 3B, (i)]. This successive crystal growth
can be repeated at least five times to yield 
five-layered vessel-shaped microcrystals (fig. 
S11). Meanwhile, the single crystallinity of the 
microvessels is kept intact without lattice mis- 
match, as proven by the single-crystal struc- 
ture analysis (figs. S12 and S13, and tables S3 
and S4). The d_hic,av value of the microvessels
observed by FM increases almost linearly as
the feeding is repeated. Consistently, PXRD
profiles exhibit a linear increase of peak intensity
derived from the (0 0 12) plane (fig. S14).
These results suggest a synchronous growth
rate, even in the sequential growth process.
We also find that the inner void of the vessel
can be partially filled by additionally feeding
a diluted solution (0.4 mg ml⁻¹) to the vessel
crystals (fig. S15).

We further find that the active edges of the 
vessels can differentiate the stereoisomer in 
the secondary growth process. When a solution
of (R)-CP4 is applied as the secondary
feed solution to vessel crystals of (S)-CP4,
irregular and granular crystals emerge at the
vessel edges [Fig. 3B, (ii)]. The successive growth
is not operative when a racemic mixture of
(S)-CP4 and (R)-CP4 is applied, resulting in
precipitation of ill-defined solids at the top 
edges of the vessel [Fig. 3B, (iii)]. 

The skeletal and dendritic crystals reported 
thus far are diverse in morphology (4-7). How- 
ever, they are typically polydisperse in size and 
suspended in solvent. By contrast, the vessel- 
shaped crystals reported herein are uniform in 
size and are fixed on a substrate with their open 
windows facing up. We envisage that such a 
standing vessel can be used as a microscopic 
“vessel” that holds a tiny volume of liquid. As 
a proof of concept, a small piece of a solid 
4,4'-dihexyloxy-3-methylazobenzene (AZO) 
crystal is placed inside a vessel. AZO liquefies
immediately after the photoisomerization
reaction upon exposure to ultraviolet light (25)
(Fig. 4A and fig. S16). As shown in the optical
and SEM images, AZO is held inside the vessel
even after the transition to the liquid phase.
The material scope is successfully expanded to
two types of polymer particles that melt upon
heating [poly(9,9-bis((S)-3,7-dimethyloctyl)-2,7-fluorene-alt-benzothiadiazole)
(PFBT)] (26) and
hydration [polyvinyl alcohol (PVA) doped
with Acid Red 52] (Fig. 4, B and C, and figs.
S17 and S18).

Fig. 4. Applications to microcontainers and PAH-like microarchitectures.
(A) SEM and optical (inset) images of a microvessel in which a piece of an AZO
microcrystal is placed before (left) and after (right) ultraviolet (UV) light irradiation
at 340 to 390 nm. Scale bars: 4 μm. (B) Optical and FM (insets) images of a
microvessel in which a PFBT microsphere is placed before (left) and after (right)
thermal heating at 90°C for 1 min. Scale bars: 5 μm. For FM, λex = 400 to
440 nm. (C) Optical and FM (insets) images of a microvessel in which
hygroscopic PVA microspheres are placed before (left) and after (right) exposure
to humid air. Scale bars: 5 μm. For FM, λex = 460 to 495 nm. (D) FM images
of fused microcrystals reminiscent of benzene (i), naphthalene (ii), anthracene
(iii), phenalene (iv), phenanthrene (v), chrysene (vi), tetracene (vii), and
hexacene (viii). Scale bars: 3 μm. λex = 340 to 390 nm.

The microvessels grow synchronously and
uniaxially, but their locations appear to be
random in principle. Nonetheless, we occasionally
find fused hexagonal vessels, which
are reminiscent of the geometries of polycyclic
aromatic hydrocarbons (PAHs). Dyads
and triads are statistically feasible even if the
locations are totally random [Fig. 4D, (i) to
(vi)]. Much larger fused polyacene-like objects
such as tetracene and hexacene are also observed
[Fig. 4D, (vii) and (viii), and fig. S19]. We assume
that a faint linear scratch or chemical heterogeneity
on the substrate facilitates nucleation
of the crystals, leading to ordered microvessels
(fig. S20) (27, 28).

Lastly, we revisit the general issue of skeletal
crystals, namely, how one can assemble skeletal
single crystals with identical size and geometry.
First, the crystalline nuclei should be
fixed on the substrate throughout the crystallization.
Given a certain area, the time-course
profiles of temperature and concentration are
kept homogeneous, leading to synchronous
growth rate and regime. Uniaxiality is simultaneously
achieved when a certain crystalline
facet preferentially lies on the substrate. This
is in sharp contrast to a suspension system, in
which the crystals drift around in a solution
with certain concentration and temperature
gradients (fig. S21). Second, the molecules
should crystallize faster than the evaporation-
and cooling-induced resupersaturation. Otherwise,
lengthy nucleation, which deteriorates
synchronous growth, may occur during the
crystal growth process (fig. S22).

To validate this strategy, we apply the same
crystallization procedure to a single-decker
aromatic molecule (M1) that features four
(methoxyphenyl)ethynyl arms on a single benzene
ring (29). Upon drop-casting, M1 forms
crystals in a synchronous and uniaxial manner
to generate platelet crystals with a unimodal
size distribution, with its crystalline (100) facet
facing down to the substrate (figs. S23 and
S24, table S5, and movie S3), demonstrating
the generality of the first part of the aforementioned
strategy. Second, the suppression
of lengthy nucleation is also important, as
revealed by the crystallization of (S)-CPx (x =
3, 2, 1), which are (S)-CP4 analogs that have
x (methoxyphenyl)ethynyl arm(s) and 4 - x
phenylethynyl arm(s) (figs. S25 to S33) (30).
Upon drop-casting, (S)-CP3 forms uniaxially
oriented skeletal crystals with C6-symmetric
prominent edges (figs. S34 and S35 and table
S6). Notably, the nucleation of (S)-CP3 continues
much longer (20 s) than that of (S)-CP4
(2.0 s), resulting in a polydisperse size distribution
(movie S4).

We also find that the uniaxiality can be 
switched by chemically modifying the surface 
of the substrate. On a methylated quartz substrate,
(S)-CP4 crystallizes into a merohedral
twin of the vessel crystals, with its crystallographic
c axis parallel to the substrate, while
maintaining the size uniformity (figs. S36 to
S38 and tables S7 and S8).

In conclusion, we find synchronous and 
uniaxial growth of skeletal crystals generated 
from a planar-chiral paracyclophane molecule. 
The crystal growth proceeds synchronously 
over a wide area on the substrate surface. The 
microparticles are single crystals and feature 
monodisperse size distribution and uniform 
vessel-like morphology with well-developed 
facets, whereas their interior remains concave.
The methodology is applicable to the crystallization
of aromatic molecules with intricate
features that can be produced within a
few tens of seconds.

fellowship and research opportunities for Phoenix STEM scholars with relevant College organizations. Work with the UChicago Center for Teaching and Learning to 
implement faculty/instructor training on issues around the URM and women students experience in STEM Majors. Collaborate with data scientists to collect data on URM 
STEM student progress and experiences in the College, and design and implement approaches for measuring the success of the program; design supplemental activities 
to enhance interactions between Phoenix scholars and faculty; and write annual reports on the activities, effectiveness, and impact of the program. The Senior Instructional 
Professor/Director will also offer courses in a STEM field. 


Qualifications: 
The position requires a Ph.D. in STEM discipline with 3+ years in academic research or administration; 3+ years of teaching experience at the College level in STEM field 
and training in STEM pedagogy; and training and experience in diversity and inclusion in academia. 


The successful candidate will also have excellent organizational skills; excellent management skills and the ability to lead and motivate a team; excellent writing skills; and 
experience in programmatic assessment (surveys, focus groups, quantitative data). 


To apply for this position candidates must submit their application through the University of Chicago’s Interfolio jobs board at http://apply.interfolio.com/108463 and 
upload a cover letter, current curriculum vitae, contact information for three references. 


Applications will be reviewed until the position is filled. 


Equal Employment Opportunity Statement 

We seek a diverse pool of applicants who wish to join an academic community that places the highest value on rigorous inquiry and encourages diverse perspectives, 
experiences, groups of individuals, and ideas to inform and stimulate intellectual challenge, engagement, and exchange. The University's Statements on Diversity are 
at https://provost.uchicago.edu/statements-diversity. The University of Chicago is an Affirmative Action/Equal Opportunity/Disabled/Veterans Employer and does not 
discriminate on the basis of race, color, religion, sex, sexual orientation, gender identity, national or ethnic origin, age, status as an individual with a disability, protected 
veteran status, genetic information, or other protected classes under the law. For additional information please see the University’s Notice of Nondiscrimination. Job seekers 
in need of a reasonable accommodation to complete the application process should call 773-834-3988 or email equalopportunity@uchicago.edu with their request. 

ICYS Research Fellow at ICYS, NIMS, Japan 


The International Center for Young Scientists (ICYS) of the National 


Institute for Materials Science (NIMS) invites applications for ICYS 
Research Fellow positions. ICYS will offer you the freedom to conduct 
independent and self-directed research in various areas of materials 
science with full access to NIMS advanced research facilities. 


The common language at ICYS is English. Clerical and technical support 
in English will be given by the ICYS staff. An annual salary starts from 
approximately 5.87 million yen, which may be increased depending on 
the performance of the Research Fellow*. In addition, a research grant 
of 2 million yen per year will be provided to each Research Fellow. The 
initial contract term is two years, which may be extended depending 
on one’s experience and performance. Also, advantage is given when 
applying to NIMS permanent researcher position (about 50% of the 
applicants are accepted). 


All applicants must have obtained a PhD degree within the last ten years. 
Applicants should submit an application form including a research 
proposal during the ICYS term, CV, a list of DOIs of journal publications, 
PDF files of three significant publications, and PhD Certificate to 
the ICYS Recruitment Desk by September 29, 2022 JST. The format 
for the application documents can be downloaded from our website. 
The selection will be made on the basis of originality and quality of the 
research proposal as well as the research achievements. Please visit our 
website for more details. 


* Approximately 23% of annual salary will be deducted as social insurance 
premium, residence tax and income tax. 


ICYS Recruitment Desk 
National Institute for Materials Science 
www.nims.go.jp/icys/recruitment/ 


Established in partnership between the Chinese Academy of Sciences and 
the Shenzhen Municipal Government, the Shenzhen Institute of Advanced 
Technology (SIAT) is a newly-created university with an objective to 
become the world's preeminent institute for emerging science and 
engineering programs. SIAT is equipped with state-of-the-art teaching 
and research facilities and is dedicated to cultivating international, 
visionary, and interdisciplinary talents while delivering research 
support to pursue innovation-driven development. 


SIAT is located in Shenzhen, also known as the "Silicon Valley of 
China," a modern, clean, and green city, well-known for its stunning 
architecture, vibrant economy, and its status as a leading global 
technology hub. SIAT is seeking applications for faculty positions of 
all ranks in the following academic programs: Computer Science and 
Engineering, Bioinformatics, Robotics, Life Sciences, Material Science 
and Engineering, Biomedical Engineering, Pharmaceutical Sciences, 
Synthetic Biology, Neurosciences, etc. SIAT seeks individuals with a 
strong record of scholarship who possess the ability to develop and 
lead high-quality teaching and research programs. SIAT offers a 
comprehensive benefits package and is committed to faculty success 
throughout the academic career trajectory, providing support for 
ambitious and world-class research projects and innovative, 
interactive teaching methods. 


Further information: 


online @sciencecareers.org 
The relevance of science is at an all-time high these days. For anyone 
who’s looking to get ahead in —or just plain get into— science, there’s 
no better, more trusted resource or authority on the subject than 
Science Careers. Here you'll find opportunities and savvy advice 
across all disciplines and levels. There’s no shortage of global 
problems today that science can’t solve. Be part of the solution. 


Careers 


FROM THE JOURNAL SCIENCE AAAS 


Science Careers helps you advance 
your career. Learn how! 


- Register for a free online account on ScienceCareers.org. 
- Search hundreds of job postings and find your perfect job. 
- Sign up to receive e-mail alerts about job postings that 
  match your criteria. 
- Upload your resume into our database and connect 
  with employers. 
- Watch one of our many webinars on different career topics 
  such as job searching, networking, and more. 


Visit ScienceCareers.org 
today — all resources are free 
The EGL Charitable Foundation 
invites you to apply to the 


Gruss Lipper Postdoctoral 
Fellowship Program 


Eligibility: 


Israeli citizenship 


Candidates must have completed PhD and/or 
MD/PhD degrees in Biomedical Science at an 
accredited Israeli University/ Medical School, 
or be in their final year of studies 


Candidates must have been awarded a 
postdoctoral position by a U.S. host research 
institution 


Details available at: www.eglcf.org 


Application deadline: October 9, 2022 


Who’s the top employer for 2021? 


Science Careers’ annual survey reveals the top companies in biotech 
& pharma voted on by Science readers. 


Read the article and employer profiles at 
sciencecareers.org/topemployers 
Try, try again 

Earlier this summer, my alma mater finally held my Ph.D. commencement, for degrees earned 
during the 2019-20 academic year. I was thrilled to return to campus to celebrate this milestone, 
which had been postponed because of the pandemic. While there, I noticed a poster board titled 
“Unlock your potential,” and one of the sticky notes on the board caught my attention: “Keep 
trying until you get it right.” As a student, I would have easily overlooked this note. But now, it 
resonated with me so strongly that I stopped to reflect on how I got to this point. 


Growing up in rural China to a family of farmers, I dreamed of becoming a scientist. Books 
and newspapers became my good friends on this path, and I made it to college in Beijing. I 
struggled but managed to graduate, though without a sense of academic fulfillment in my field 
of engineering. Still, my goal to become a scientist remained alive, and I decided to pursue a 
master’s degree in a new field, meteorology. I was inspired to explore climate change and how 
agriculture could adapt, which is vital to farmers like my family. It was invigorating that my 
work made an impact. When I finished, I wanted to try something even more ambitious: a 
doctoral degree. 

My aspiration met hard reality when I failed my target school’s Ph.D. entrance exam. Nearly 
all my friends and professors advised me to put my Ph.D. ambitions aside and accept a coveted 
position I had been offered at a meteorological bureau, observing and reporting weather 
conditions. But I doubted whether this job was right for me. I enjoyed research and I worried 
the job, with its emphasis on routine work and little room for creativity, would dampen my 
passion to explore. So, I decided to pursue a riskier path: applying abroad for doctoral 
programs. 

While studying to pass the English language tests required for those programs, I worked as a 
research assistant to earn a living and support my applications. My living conditions—all I 
could afford was a windowless basement room that fit just a twin bed and one chair—occasionally 
tempted me to regret my decisions. As I took the English tests multiple times but never scored 
high enough, my self-doubt grew. Had I chosen the wrong path? Despite the setbacks, the fear 
of a bigger failure—being jobless the following year—motivated me to keep trying. Eventually, 
I passed the tests and was thrilled to be accepted into a doctoral program in the United States. 


“I learned to enjoy the process 
of trying and the unexpected 
opportunities it brought.” 


Once there, I encountered even more challenges—language and social barriers, cultural 
differences, demanding academic requirements, and more. Yet I held the same attitude that 
lifted me up from the basement: Keep trying. In particular, communicating proved to be very 
difficult. In Beijing, I was always eager to talk to people. But in the United States, I often 
wouldn’t understand why something was funny even as people around me laughed. The frustration 
made me hesitant to attend social events, but my heart called on me to not stop trying. 
Fortunately, my classmates continued to invite me and would break the ice by introducing my 
roots as a farmer, which unexpectedly became an effective conversation starter. Over time, I 
spoke fluent English and made new friends. I moved on to a postdoc position and then a faculty 
job search. Similar to the earlier steps in my journey, the process was by no means smooth, as 
I only got one interview in the first 2 years, but the lessons I learned paved the way to 
success in the third year. 

On the eve of the Ph.D. commencement, I realized that by 
trying again and again, I have unlocked my potential. Trying 
does not always lead to ideal results, which can be frustrating. 
However, by following my heart, I developed a clearer view of 
how I needed to adapt. By focusing more on what I achieved 
and less on what I missed, I learned to enjoy the process of 
trying and the unexpected opportunities it brought, regardless 
of how different they were from my initial expectations. 
Without the attempts and adjustments, I would not have 
found my right place in the world and grown into the person 
I wanted to be. 


Huanping Huang is a postdoctoral fellow at Lawrence Berkeley 
National Laboratory and a newly hired assistant professor 
at Louisiana State University, Baton Rouge. Send your career story 
to SciCareerEditor@aaas.org. 
What's Your Next Career Move? 


From networking to mentoring to evaluating 
your skills, find answers to your career questions 
on Science Careers 


To view the complete collection, visit 
ScienceCareers.org/booklets 
Publish your research in the Science family of journals 


The Science family of journals (Science, Science Advances, Science Immunology, Science 
Robotics, Science Signaling, and Science Translational Medicine) are among the most highly 
regarded journals in the world for quality and selectivity. Our peer-reviewed journals are 
committed to publishing cutting-edge research, incisive scientific commentary, and insights 
on what's important to the scientific world at the highest standards. 


Submit your research today! 
Learn more at Science.org/journals 


